History log of /openbsd-current/sys/net/if.h
Revision Date Author Comments
# 1.217 09-Jun-2024 jan

Introduce IFCAP_VLAN_HWOFFLOAD for vio(4).

Add IFCAP_VLAN_HWOFFLOAD to signal hardware like vio(4) can handle
checksum or TSO offloading with inline VLAN tags.

tested by Mark Patruck, sf@ and bluhm@

ok sf@ and bluhm@


# 1.216 11-Apr-2024 bluhm

Prevent changing interface loopback flag from userland.

IFF_LOOPBACK is telling userland the behaviour of a specific driver,
it is supposed to be static and permanent. Clearing the loopback
flag on lo0 could lead to a kernel crash due to inconsistent multicast
igmp group.

Reported-by: syzbot+2f24ed6c8ddb2d6bb22c@syzkaller.appspotmail.com
OK claudio@ deraadt@


Revision tags: OPENBSD_7_5_BASE
# 1.215 11-Nov-2023 bluhm

Pass constant struct sockaddr to interface lookup functions.

OK mvs@


Revision tags: OPENBSD_7_4_BASE
# 1.214 30-May-2023 dlg

add net_tq_barriers

this waits once for something to end in all the net tqs.

ok claudio@


# 1.213 16-May-2023 jan

Use separate IFCAPs for LRO and TSO.

This diff introduces separate capabilities for TCP offloading. We split this
into LRO (large receive offloading) and TSO (TCP segmentation offloading).
LRO can be turned on/off via tcprecvoffload option of ifconfig and is not
inherited to sub interfaces.

TSO is inherited by sub interfaces to signal this hardware offloading capability
to the network stack.

With tweaks from bluhm, claudio and dlg

ok bluhm, claudio


# 1.212 15-May-2023 bluhm

Implement the TCP/IP layer for hardware TCP segmentation offload.
If the driver of a network interface claims to support TSO, do not
chop the packet in software, but pass it down to the interface
layer.
Precalculate parts of the pseudo header checksum, but without the
packet length. The length of all generated smaller packets is not
known yet. Driver and hardware will use the mbuf packet header
field ph_mss to calculate it and update checksum.
Introduce separate flags IFCAP_TSOv4 and IFCAP_TSOv6 as hardware
might support only one protocol family. The old flag IFXF_TSO is
only relevant for large receive offload. It is misnamed, but keep
that for now.
Note that drivers do not set TSO capabilities yet. Also the ifconfig
flags and pseudo interfaces capabilities will be done separately.
So this commit should not change behavior.
heavily based on the work from jan@; OK sashan@


Revision tags: OPENBSD_7_3_BASE
# 1.211 07-Mar-2023 jan

Avoid enabling TSO on interfaces which are already attached to a bridge.

with tweaks from claudio and deraadt

ok claudio, bluhm


# 1.210 27-Feb-2023 jan

Turn off TSO if interface is added to layer 2 devices.

ok bluhm@, claudio@


Revision tags: OPENBSD_7_2_BASE
# 1.209 27-Jun-2022 jan

Introduce Large Receive Offloading of TCP segment offloading for ix(4). It is
disabled by default. Also add a tso option to ifconfig(8) to enable and
disable this feature.

ok deraadt


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE OPENBSD_7_1_BASE
# 1.208 11-Mar-2021 florian

When RFC 8981 obsoleted RFC 4941 the terminology changed from
"privacy extensions" to "temporary address extensions"

Change ifconfig(8) to output temporary after temporary addresses and
add "temporary" option which is an alias for autoconfprivacy for now.

Also make AUTOCONF6TEMP a positive flag that is set by default.
Previously the negative flag "INET6_NOPRIVACY" was set when privacy
addresses were disabled. This makes the flags output less ugly and
will allow us to disable autoconf addresses while having temporary
addresses enabled in the future.

More work is needed in slaacd.

input benno, jmc, deraadt
previous version OK benno
OK jmc, kn


# 1.207 20-Feb-2021 dlg

add a MONITOR flag to ifaces to say they're only used for watching packets.

an example use of this is when you have a span port on a switch and
you want to be able to see the packets coming out of it with tcpdump,
but do not want these packets to enter the network stack for
processing. this is particularly important if the span port is
pushing a copy of any packets related to the machine doing the
monitoring as it will confuse pf states and the stack.

ok benno@


# 1.206 01-Feb-2021 mvs

ifunit() was fully replaced by if_unit(9) and should go away.

ok bluhm@ dlg@


# 1.205 18-Jan-2021 mvs

Introduce new function if_unit(9). This function returns a pointer to the
interface descriptor corresponding to the unique name. This descriptor
is guaranteed to be valid until if_put(9) is called on the returned
pointer. if_unit(9) should replace already existent ifunit() which
returns descriptor not safe for dereference when context was switched.
This allows us to avoid some use-after-free issues in ioctl(2) path.
Also this unifies interface descriptor usage.

ok claudio@ sashan@


# 1.204 30-Sep-2020 mvs

We have no if_attachtail() function so remove the declaration.

ok deraadt@ claudio@


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.203 25-Jul-2019 krw

AF_INET comes before AF_INET6. Shorten line to <80 chars.

pointed out by claudio@


# 1.202 25-Jul-2019 krw

Add IFXF_AUTOCONF4 to if_xflags to match IFXF_AUTOCONF6. Let
ifconfig set/unset it.

ok deraadt@ kmos@


# 1.201 19-Apr-2019 dlg

add IF_HDRPRIO_OUTER for rxprio

IF_HDRPRIO_OUTER says you want the priority from the outer encap header.

ok claudio@


Revision tags: OPENBSD_6_5_BASE
# 1.200 10-Apr-2019 dlg

add struct if_sffpage so userland can read a page of sfp/qsfp info

this will be used by ifconfig so it can show you things like who
made what module you're using and when it was made, and on some
modules you get diag info like temperature, vcc, and rx and tx
powers.

im putting the kernel side in so we can keep fiddling with the
userland printf side of things.

this work was done based on a question by rachel roch
ok deraadt@
enthusiasm from many including mikeb@ sthen@ patrick@


# 1.199 23-Jan-2019 dlg

add a SIOCGPWE3 ioctl for interfaces to advertise pwe3 capability

im going to turn mpw into an ethernet interface, which includes
changing its if_type to IFT_ETHER. currently ldpd looks for if_type
IFT_MPLSTUNNEL to decide if an interface is a pseudowire, ie, it's
going to break. the ioctl will let ldpd ask the interface if it is
pseudowire capable as an alternative.

ok claudio@


# 1.198 12-Nov-2018 dlg

add ifreq bits for the tx header prio field ioctls

a tx header prio can be set to a fixed value from 0 to 7, or magic
values to represent populating the prio field from the encapsulated
packet, or from the mbuf prio value.

ok claudio@


# 1.197 12-Nov-2018 krw

Add new routing socket message RTM_80211INFO to provide details of
802.11 interface state changes (e.g. SSID) to interested parties.

Original diff from phessler@. Many suggestions and tweaks from
claudio@, stsp@, anton@.

ok claudio@ stsp@ anton@ phessler@


# 1.196 11-Nov-2018 dlg

use the llprio on gre(4) and eoip(4) interfaces for the keepalive tos

llprios are valued 0 to 7, while the ip tos/dscp/tclass is an 8 bit
value. fortunately the high 3 bits map nicely to the llprio values,
so we shift the llprio into place when generating the keepalive
frames. the llprio is defaulted to the value that cisco uses for
their gre keepalives.
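
The shift described above can be sketched in a few lines of C. This is only an illustration of the bit arithmetic, not the code from gre(4); the helper name llprio_to_tos() is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: place a 0..7 link-level priority in the
 * high 3 bits of an 8-bit IP TOS/traffic class value.
 */
static uint8_t
llprio_to_tos(uint8_t llprio)
{
	/* keep only the low 3 bits, then shift past the 5 low TOS bits */
	return ((uint8_t)((llprio & 0x7) << 5));
}
```

For example, a priority of 6 ends up as 0xc0 and a priority of 7 as 0xe0, so the low five bits of the TOS byte stay zero.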


Revision tags: OPENBSD_6_4_BASE
# 1.195 12-Sep-2018 krw

Fix obvious cut&pasto in comment (ifa_msghdr -> if_announcemsghdr).

ok claudio@


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.

ok krw@ tb@
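
A range check of the kind described above might look like the following sketch. The names PRIO_MIN, PRIO_MAX and check_llprio() are made up for illustration; only the 0 to 7 inclusive range comes from the commit message.

```c
#include <errno.h>

#define PRIO_MIN	0	/* assumed lower bound */
#define PRIO_MAX	7	/* assumed upper bound, per "0 and 7 inclusive" */

/*
 * Hypothetical validator: return 0 if prio is usable,
 * EINVAL otherwise.
 */
static int
check_llprio(int prio)
{
	if (prio < PRIO_MIN || prio > PRIO_MAX)
		return (EINVAL);
	return (0);
}
```

A value such as 200 would have passed a "less than UCHAR_MAX" test but is rejected here.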


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.
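
The alignment arithmetic behind this change can be checked with a small sketch. The constants use the conventional values (a 14 byte Ethernet header and a 2 byte ETHER_ALIGN pad); the helper names are hypothetical and the cluster start is assumed to be 4-byte aligned.

```c
#include <stddef.h>

#define ETHER_HDR_LEN	14	/* dst + src + ethertype */
#define ETHER_ALIGN	2	/* pad placed in front of the frame */
#define CL_2K		2048	/* size the chip wants for its rx buffer */

/* offset of the IP header from the start of the cluster */
static size_t
ip_hdr_offset(size_t pad)
{
	return (pad + ETHER_HDR_LEN);
}

/* nonzero if the IP header lands on a 4-byte boundary */
static int
ip_hdr_aligned(size_t pad)
{
	return (ip_hdr_offset(pad) % 4 == 0);
}

/* cluster bytes needed for a 2k rx buffer plus the pad */
static size_t
cluster_needed(size_t pad)
{
	return (CL_2K + pad);
}
```

With no pad the IP header sits at offset 14 and is misaligned; with the 2 byte pad it sits at offset 16, but the buffer then needs 2050 bytes, which no longer fits a 2k cluster. Using a 4k cluster for that leaves 2046 bytes unused, the "nearly 2k" mentioned above.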


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interfaces parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interfaces function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@
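
The get/put pairing described above can be illustrated with a toy, single-threaded sketch. struct toy_if and the toy_* functions are hypothetical stand-ins, not the kernel's struct ifnet, if_get() or if_put(); a real implementation needs atomic counters and a way for the detaching thread to sleep.

```c
#include <stddef.h>

/* hypothetical stand-in for an interface descriptor */
struct toy_if {
	unsigned int	refcnt;		/* outstanding references */
};

/* take a reference; every successful call must be paired with toy_put() */
static struct toy_if *
toy_get(struct toy_if *ifp)
{
	if (ifp == NULL)
		return (NULL);
	ifp->refcnt++;
	return (ifp);
}

/* release a reference previously taken with toy_get() */
static void
toy_put(struct toy_if *ifp)
{
	if (ifp != NULL)
		ifp->refcnt--;
}

/* detach may only proceed once nobody holds a reference */
static int
toy_detach_ready(const struct toy_if *ifp)
{
	return (ifp->refcnt == 0);
}
```

The point of the sketch is the discipline rather than the counter itself: a reference is held only between a get and its matching put, so the detach path has something concrete to wait on.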


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input() a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who currently has food poisoning.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@
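
The wraparound-tolerant comparison mentioned above can be sketched as follows. This is a standalone illustration of comparing a difference rather than absolute tick values; tick_diff() and has_elapsed() are hypothetical names, and the subtraction is done in unsigned arithmetic so the sketch avoids signed overflow (the conversion back to int is implementation-defined but wraps on common compilers).

```c
/*
 * Difference a - b between two tick counters, computed so that it
 * stays small and correct even when the counter has wrapped.
 */
static int
tick_diff(int a, int b)
{
	return ((int)((unsigned int)a - (unsigned int)b));
}

/* nonzero once at least `interval` ticks have passed since `since` */
static int
has_elapsed(int since, int now, int interval)
{
	return (tick_diff(now, since) >= interval);
}
```

Comparing the difference against the interval keeps working when `now` has wrapped to a negative value while `since` is still near the top of the range, which a direct `now > since + interval` comparison does not.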


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With inputs from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf
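
A mask of this kind can be sketched as below. The CAP_* names and bit values are made up for illustration and should not be read as the real IFCAP_* definitions from net/if.h; the sketch only shows how a single mask replaces a hand-written list of flags.

```c
#include <stdint.h>

/* hypothetical capability bits, values chosen for illustration only */
#define CAP_CSUM_IPv4	0x001
#define CAP_CSUM_TCPv4	0x002
#define CAP_CSUM_UDPv4	0x004
#define CAP_VLAN_MTU	0x010	/* not a checksum capability */
#define CAP_CSUM_TCPv6	0x080
#define CAP_CSUM_UDPv6	0x100

/* one mask covering every checksum capability, IPv6 included */
#define CAP_CSUM_MASK	(CAP_CSUM_IPv4 | CAP_CSUM_TCPv4 | \
			 CAP_CSUM_UDPv4 | CAP_CSUM_TCPv6 | CAP_CSUM_UDPv6)

/* capabilities a child interface would inherit from its parent */
static uint32_t
inherit_csum_caps(uint32_t parent_caps)
{
	return (parent_caps & CAP_CSUM_MASK);
}
```

Because the mask is defined next to the flags, adding a new checksum capability to it updates every user at once, which is how the IPv6 flags end up inherited here.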


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
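
The changed predicate can be sketched like this. The LS_* names and numeric values are illustrative assumptions, not the real LINK_STATE_* constants; the only property the sketch relies on is that the "up" states compare greater than or equal to LS_UP while the unknown state does not.

```c
/* hypothetical link states, ordered so that up states are >= LS_UP */
enum toy_link_state {
	LS_UNKNOWN	= 0,	/* driver cannot tell */
	LS_DOWN		= 2,
	LS_UP		= 4,
	LS_FULL_DUPLEX	= 6
};

/* treat "unknown" as up, in addition to every real up state */
#define LS_IS_UP(s)	((s) >= LS_UP || (s) == LS_UNKNOWN)
```

Callers can then use the one macro instead of spelling out a separate check for interfaces whose drivers never report link state.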


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows us to clean up the MPLS input and output paths and fix mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladdr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed, so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalive packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows binding interfaces to
alternate routing tables and separating them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx ring's size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce an if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows specifying less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@
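
A minimal sketch of the default-priority rule described above, not the
kernel code itself. The helper name route_default_prio() is hypothetical,
and the values used for RTP_NONE and RTP_STATIC are assumptions for
illustration only.

```c
#include <assert.h>

/* Assumed values for illustration; see net/route.h for the real ones. */
#define RTP_NONE	0	/* caller gave no explicit priority */
#define RTP_STATIC	8	/* base priority of static routes */

/*
 * Hypothetical helper: pick the priority for a newly added route.
 * An explicit priority wins; otherwise the interface's if_priority
 * is added to RTP_STATIC.
 */
static int
route_default_prio(int explicit_prio, int if_priority)
{
	assert(if_priority >= 0);
	if (explicit_prio != RTP_NONE)
		return (explicit_prio);
	return (RTP_STATIC + if_priority);
}
```

Under these assumed values an interface with if_priority 0 yields 8 and
one with if_priority 4 yields 12. If lower values are preferred, the
second interface's route is only used once the first one goes away.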


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbuf clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interface's start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now it's queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.
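
The "queue, queue, queue, send" behaviour can be sketched roughly as
follows. This is a user-space illustration, not the committed code; struct
txif and the txif_*() functions are hypothetical names, and "one softnet
run" is modelled as a single call to txif_flush().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an interface; not struct ifnet. */
struct txif {
	int qlen;		/* packets waiting on the send queue */
	int start_pending;	/* start routine should run later */
	int start_calls;	/* how often the start routine ran */
	int posted;		/* packets handed to the tx ring */
};

/* Stand-in for a driver start routine: post everything queued. */
static void
txif_start(struct txif *ifp)
{
	ifp->start_calls++;
	ifp->posted += ifp->qlen;
	ifp->qlen = 0;
}

/* Queue one packet, but only note that a start is needed. */
static void
txif_enqueue(struct txif *ifp)
{
	assert(ifp != NULL);
	ifp->qlen++;
	ifp->start_pending = 1;
}

/* Called once at the end of a processing pass. */
static void
txif_flush(struct txif *ifp)
{
	if (ifp->start_pending) {
		ifp->start_pending = 0;
		txif_start(ifp);
	}
}
```

Calling txif_enqueue() three times and then txif_flush() once runs the
start routine a single time with three packets on the queue.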


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces'
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows specifying a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
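
A small sketch of why the do ... while form matters. The queue type and
the PKTQ_ENQUEUE() macro below are hypothetical miniatures, not the real
struct ifqueue or IF_ENQUEUE().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature packet and queue types. */
struct pkt {
	struct pkt *next;
};

struct pktq {
	struct pkt *head;
	struct pkt *tail;
	int len;
};

/*
 * Multi-statement macro wrapped in do { } while (0): it expands to
 * exactly one statement and requires the trailing semicolon.
 */
#define PKTQ_ENQUEUE(q, p) do {				\
	(p)->next = NULL;				\
	if ((q)->tail == NULL)				\
		(q)->head = (p);			\
	else						\
		(q)->tail->next = (p);			\
	(q)->tail = (p);				\
	(q)->len++;					\
} while (0)

/* Returns 1 if the packet was queued, 0 if the queue was full. */
static int
pktq_try_enqueue(struct pktq *q, struct pkt *p, int maxlen)
{
	assert(q != NULL && p != NULL);
	/* Unbraced if/else: only valid because the macro is one statement. */
	if (q->len < maxlen)
		PKTQ_ENQUEUE(q, p);
	else
		return (0);
	return (1);
}
```

If the macro body were a bare { ... } block instead, the semicolon after
PKTQ_ENQUEUE(q, p) would end the if statement and the following else
would fail to compile.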


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on a similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add an if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@
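
A rough user-space sketch of the congestion indicator described above.
The names struct inq, inq_enqueue() and inq_congested() are hypothetical,
and the flag is cleared by comparing timestamps on read rather than by a
timer, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>

#define CONGESTION_MS	10	/* the 10ms from the commit message */

/* Hypothetical input queue; only the fields needed for the sketch. */
struct inq {
	int len;		/* packets currently queued */
	int maxlen;		/* queue limit */
	int congested;		/* congestion indicator */
	long clear_at_ms;	/* when the indicator may be unset */
};

/* Returns 1 if queued, 0 if dropped because the queue is full. */
static int
inq_enqueue(struct inq *q, long now_ms)
{
	assert(q != NULL);
	if (q->len >= q->maxlen) {
		/* Queue full: flag congestion and (re)arm the deadline. */
		q->congested = 1;
		q->clear_at_ms = now_ms + CONGESTION_MS;
		return (0);
	}
	q->len++;
	return (1);
}

/* Other subsystems would consult this before doing extra work. */
static int
inq_congested(struct inq *q, long now_ms)
{
	if (q->congested && now_ms >= q->clear_at_ms)
		q->congested = 0;
	return (q->congested);
}
```

A caller that sees inq_congested() return nonzero could, for example,
skip optional processing until the deadline has passed.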


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok
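
The existence check described above might look roughly like this. It is a
sketch only: struct ifnet_stub and if_index_valid() are hypothetical names,
and the table and limit are passed in as parameters instead of being the
kernel's globals.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical placeholder for struct ifnet. */
struct ifnet_stub {
	int unused;
};

/*
 * An index refers to an existing interface only if it is below the
 * table limit and its slot has not been cleared by a destroy.
 */
static int
if_index_valid(struct ifnet_stub **tbl, unsigned int indexlim,
    unsigned int idx)
{
	assert(tbl != NULL);
	return (idx < indexlim && tbl[idx] != NULL);
}
```

Checking the limit first keeps the table lookup in bounds; the NULL test
then covers slots whose interface has been destroyed.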


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow registering callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange got updated on every packet transmission/receipt.
now: if_lastchange gets updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we needed two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding the 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.216 11-Apr-2024 bluhm

Prevent changing interface loopback flag from userland.

IFF_LOOPBACK is telling userland the behaviour of a specific driver,
it is supposed to be static and permanent. Clearing the loopback
flag on lo0 could lead to a kernel crash due to inconsistent multicast
igmp group.

Reported-by: syzbot+2f24ed6c8ddb2d6bb22c@syzkaller.appspotmail.com
OK claudio@ deraadt@


Revision tags: OPENBSD_7_5_BASE
# 1.215 11-Nov-2023 bluhm

Pass constant struct sockaddr to interface lookup functions.

OK mvs@


Revision tags: OPENBSD_7_4_BASE
# 1.214 30-May-2023 dlg

add net_tq_barriers

this waits once for something to end in all the net tqs.

ok claudio@


# 1.213 16-May-2023 jan

Use separate IFCAPs for LRO and TSO.

This diff introduces separate capabilities for TCP offloading. We split this
into LRO (large receive offloading) and TSO (TCP segmentation offloading).
LRO can be turned on/off via tcprecvoffload option of ifconfig and is not
inherited to sub interfaces.

TSO is inherited by sub interfaces to signal this hardware offloading capability
to the network stack.

With tweaks from bluhm, claudio and dlg

ok bluhm, claudio


# 1.212 15-May-2023 bluhm

Implement the TCP/IP layer for hardware TCP segmentation offload.
If the driver of a network interface claims to support TSO, do not
chop the packet in software, but pass it down to the interface
layer.
Precalculate parts of the pseudo header checksum, but without the
packet length. The length of all generated smaller packets is not
known yet. Driver and hardware will use the mbuf packet header
field ph_mss to calculate it and update checksum.
Introduce separate flags IFCAP_TSOv4 and IFCAP_TSOv6 as hardware
might support ony one protocol family. The old flag IFXF_TSO is
only relevant for large receive offload. It is missnamed, but keep
that for now.
Note that drivers do not set TSO capabilites yet. Also the ifconfig
flags and pseudo interfaces capabilities will be done separately.
So this commit should not change behavior.
heavily based on the work from jan@; OK sashan@


Revision tags: OPENBSD_7_3_BASE
# 1.211 07-Mar-2023 jan

Avoid enabling TSO on interfaces which are already attached to a bridge.

with tweaks from claudio and deraadt

ok claudio, bluhm


# 1.210 27-Feb-2023 jan

Turn off TSO if interface is added to layer 2 devices.

ok bluhm@, claudio@


Revision tags: OPENBSD_7_2_BASE
# 1.209 27-Jun-2022 jan

Introduce Large Receive Offloading of TCP segment offloading for ix(4). It is
disabled by default. Also add a tso option to ifconfig(8) to enable and
disable this feature.

ok deraadt


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE OPENBSD_7_1_BASE
# 1.208 11-Mar-2021 florian

When RFC 8981 obsoleted RFC 4941 the terminology changed from
"privacy extensions" to "temporary address extensions"

Change ifconfig(8) to output temporary after temporary addresses and
add "temporary" option which is an alias for autoconfprivacy for now.

Also make AUTOCONF6TEMP a positiv flag that is set by default.
Previously the negative flag "INET6_NOPRIVACY" was set when privacy
addresses were disabled. This makes the flags output less ugly and
will allow us to disable autoconf addresses while having temporary
addresses enabled in the future.

More work is needed in slaacd.

input benno, jmc, deraadt
previous verison OK benno
OK jmc, kn


# 1.207 20-Feb-2021 dlg

add a MONITOR flag to ifaces to say they're only used for watching packets.

an example use of this is when you have a span port on a switch and
you want to be able to see the packets coming out of it with tcpdump,
but do not want these packets to enter the network stack for
processing. this is particularly important if the span port is
pushing a copy of any packets related to the machine doing the
monitoring as it will confuse pf states and the stack.

ok benno@


# 1.206 01-Feb-2021 mvs

ifunit() was fully replaced by if_unit(9) and should go away.

ok bluhm@ dlg@


# 1.205 18-Jan-2021 mvs

Introduce new function if_unit(9). This function returns a pointer the
interface descriptor corresponding to the unique name. This descriptor
is guaranteed to be valid until if_put(9) is called on the returned
pointer. if_unit(9) should replace already existent ifunit() which
returns descriptor not safe for dereference when context was switched.
This allow us to avoid some use-after-free issues in ioctl(2) path.
Also this unifies interface descriptor usage.

ok claudio@ sashan@


# 1.204 30-Sep-2020 mvs

We have no if_attachtail() function so remove the declaration.

ok deraadt@ claudio@


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.203 25-Jul-2019 krw

AF_INET comes before AF_INET6. Shorten line to <80 chars.

pointed out by claudio@


# 1.202 25-Jul-2019 krw

Add IFXF_AUTOCONF4 to if_xflags to match IFXF_AUTOCONF6. Let
ifconfig set/unset it.

ok deraadt@ kmos@


# 1.201 19-Apr-2019 dlg

add IF_HDRPRIO_OUTER for rxprio

IF_HDRPRIO_OUTER says you want the priority from the outer encap header.

ok claudio@


Revision tags: OPENBSD_6_5_BASE
# 1.200 10-Apr-2019 dlg

add struct if_sffpage so userland can read a page of sfp/qsfp info

this will be used by ifconfig so it can show you things like who
mades what module you're using and when it was made, and on some
modules you get diag info like temperature, vcc, and rx and tx
powers.

im putting the kernel side in so we can keep fiddling with the
userland printf side of things.

this work was done based on a question by rachel roch
ok deraadt@
enthusiasm from many including mikeb@ sthen@ patrick@


# 1.199 23-Jan-2019 dlg

add a SIOCGPWE3 ioctl for interfaces to advertise pwe3 capability

im going to turn mpw into an ethernet interface, which includes
changing its if_type to IFT_ETHER. currently ldpd looks for if_type
IFT_MPLSTUNNEL to decide if an interface is a pseudowire, ie, it's
going to break. the ioctl will let ldpd ask the interface if it is
pseudowire capable as an alternative.

ok claudio@


# 1.198 12-Nov-2018 dlg

add ifreq bits for the tx header prio field ioctls

a tx header prio can set to a fixed value from 0 to 7, or magic
values to represent populating the prio field from the encapsulated
packet, or from the mbuf prio value.

ok claudio@


# 1.197 12-Nov-2018 krw

Add new routing socket message RTM_80211INFO to provide details of
802.11 interface state changes (e.g. SSID) to interested parties.

Original diff from phessler@. Many suggestions and tweaks from
claudio@, stsp@, anton@.

ok claudio@ stsp@ anton@ phessler@


# 1.196 11-Nov-2018 dlg

use the llprio on gre(4) and eoip(4) interfaces for the keepalive tos

llprios are valued 0 to 7, while the ip tos/dscp/tclass is an 8 bit
value. fortunately the high 3 bits map nicely to the llprio values,
so we shift the llprio into place when generating the keepalive
frames. the llprio is defaulted to the value that cisco uses for
their gre keepalives.


Revision tags: OPENBSD_6_4_BASE
# 1.195 12-Sep-2018 krw

Fix obvious cut&pasto in comment (ifa_msghdr -> if_announcemsghdr).

ok claudio@


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.

ok krw@ tb@


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownerhsip
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.
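
The alignment arithmetic behind the 2k + 2 cluster can be sketched as
follows. This is only an illustration: the DEMO_* names and both helper
functions are hypothetical and do not come from the tree. The sketch
relies only on the 14-byte Ethernet header and the 2-byte pad mentioned
above.

```c
#include <stddef.h>

#define DEMO_ETHER_HDR_LEN	14	/* dst (6) + src (6) + type (2) */
#define DEMO_ETHER_ALIGN	2	/* pad so that 2 + 14 = 16 */

/*
 * Offset of the IP header inside the buffer when the Ethernet frame
 * starts `pad` bytes after the (aligned) start of the cluster.
 */
size_t
demo_ip_offset(size_t pad)
{
	return pad + DEMO_ETHER_HDR_LEN;
}

/*
 * Bytes of cluster needed when the chip insists on `rxlen` bytes of
 * rx buffer and the frame is shifted by `pad` bytes.
 */
size_t
demo_cluster_size(size_t rxlen, size_t pad)
{
	return rxlen + pad;
}
```

With no pad the IP header lands at offset 14, which is not a multiple
of 4. With the 2-byte pad it lands at offset 16, and a chip asking for
a 2048-byte buffer then needs 2050 bytes, which is what the new pool
provides.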


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interfaces parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interfaces function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@
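
One common way to get the "only one instance running at a time"
behaviour described above is a counter that doubles as a lock and as a
"run again" note. The sketch below is a userland illustration using C11
atomics. All demo_* names are hypothetical and it is not the code from
this commit.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the parts of an interface the sketch needs. */
struct demo_if {
	atomic_uint	serializer;	/* 0 = idle, >0 = running/pending */
	unsigned int	runs;		/* completed passes of start() */
	unsigned int	depth;		/* current nesting of start() */
	unsigned int	max_depth;	/* highest nesting ever observed */
	void		(*start)(struct demo_if *);
};

void
demo_if_start(struct demo_if *ifp)
{
	/* somebody else is already in start(): just leave a note */
	if (atomic_fetch_add(&ifp->serializer, 1) != 0)
		return;

	do {
		/* collapse every pending request into this one pass */
		atomic_store(&ifp->serializer, 1);
		ifp->start(ifp);
		ifp->runs++;
		/* loop again if a request arrived while start() ran */
	} while (atomic_fetch_sub(&ifp->serializer, 1) != 1);
}

/*
 * Example start routine: during its first pass it simulates another
 * context asking for a transmit while the routine is still running.
 */
void
demo_start(struct demo_if *ifp)
{
	ifp->depth++;
	if (ifp->depth > ifp->max_depth)
		ifp->max_depth = ifp->depth;
	if (ifp->runs == 0)
		demo_if_start(ifp);
	ifp->depth--;
}
```

Calling demo_if_start() on a zeroed struct demo_if whose start member
points at demo_start runs the routine twice, back to back, and never
nested: the request made during the first pass is deferred to a second
pass by the caller that already owns the counter.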


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@
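
The get/put pairing described above can be pictured with a small
userland sketch. Everything here is hypothetical (the fake_* names, the
fixed table, the plain non-atomic counter). It only shows the shape of
the API: a lookup takes a reference, and the caller gives it back when
done.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct ifnet; only what the sketch needs. */
struct fake_ifnet {
	unsigned int	if_index;
	unsigned int	if_refcnt;	/* a real kernel would use atomics */
};

#define FAKE_NIFS	4

struct fake_ifnet fake_iftable[FAKE_NIFS] = {
	{ 0, 0 }, { 1, 0 }, { 2, 0 }, { 3, 0 },
};

/* Look an interface up by index and take a reference on it. */
struct fake_ifnet *
fake_if_get(unsigned int idx)
{
	if (idx >= FAKE_NIFS)
		return NULL;
	fake_iftable[idx].if_refcnt++;
	return &fake_iftable[idx];
}

/* Release a reference; a NULL pointer is tolerated in this sketch. */
void
fake_if_put(struct fake_ifnet *ifp)
{
	if (ifp == NULL)
		return;
	ifp->if_refcnt--;
}
```

A detach path built on this could wait until if_refcnt drops back to
zero before freeing the interface, which is the backfill the message
above refers to.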


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input() a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who currently has food poisoning.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@
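
The wrap-tolerant comparison mentioned above can be sketched in
portable C. The helper name is hypothetical, and unlike the kernel
idiom it does the subtraction on unsigned values so that the wraparound
is well defined in standard C; it assumes the usual two's complement
conversion back to int.

```c
#include <limits.h>

/*
 * Hypothetical helper, not taken from the tree: returns nonzero once
 * at least `interval` ticks have passed between `last` and `now`,
 * even if the tick counter wrapped in between.
 */
int
demo_ticks_elapsed(int now, int last, int interval)
{
	/*
	 * Subtract as unsigned so overflow wraps instead of being
	 * undefined, then read the result as a signed relative
	 * interval.
	 */
	int diff = (int)((unsigned int)now - (unsigned int)last);

	return diff >= interval;
}
```

Comparing the two absolute values directly (now >= last + interval)
would go wrong as soon as the counter wraps from INT_MAX to INT_MIN;
comparing the difference does not.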


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With inputs from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro; the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther
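
The union trick can be illustrated with a reduced example. The demo_*
types below are hypothetical and much smaller than the real structures.
The point is only that a member made of bytes has byte alignment, and
that putting it in a union with an int raises the alignment of the
enclosing struct.

```c
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical address record made only of bytes. */
struct demo_addr {
	unsigned char	len;
	unsigned char	family;
	unsigned char	data[14];
};

/* Hypothetical request structure using the alignment union. */
struct demo_req {
	char	name[16];
	union {
		struct demo_addr	addr;
		int			align;	/* never read; only raises alignment */
	} u;
};
```

Because a union is aligned like its most strictly aligned member,
alignof(struct demo_req) is at least alignof(int), and u.addr starts on
an int boundary wherever the compiler places the struct, including on
the stack.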


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
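
A sketch of a predicate with these semantics is shown below. The DEMO_*
names and the numeric values are assumptions made for the example; the
real definitions are in net/if.h and may differ.

```c
/* Illustrative values only; see <net/if.h> for the real ones. */
enum demo_link_state {
	DEMO_LINK_STATE_UNKNOWN = 0,	/* driver cannot tell */
	DEMO_LINK_STATE_DOWN = 2,
	DEMO_LINK_STATE_UP = 4,
	DEMO_LINK_STATE_FULL_DUPLEX = 6
};

/* "up" means known to be up, or not known to be down */
#define DEMO_LINK_STATE_IS_UP(s) \
	((s) >= DEMO_LINK_STATE_UP || (s) == DEMO_LINK_STATE_UNKNOWN)

/* Function wrapper so the predicate can be called from tests. */
int
demo_link_usable(int state)
{
	return DEMO_LINK_STATE_IS_UP(state);
}
```

Treating "unknown" as up means callers no longer need a separate check
for drivers that do not report link state at all.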


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows us to clean up the MPLS input and output paths and fix mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladdr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalives packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx rings size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce a if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows to specify less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interfaces start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now its queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows to specify a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
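
Why the do ... while wrapper matters can be shown with a reduced
example. DEMO_ENQUEUE and struct demo_queue are hypothetical and far
simpler than the real queue macros. What matters is that the macro
expands to exactly one statement, so it can be followed by a semicolon
inside an unbraced if/else.

```c
/* Hypothetical queue with a length limit and a drop counter. */
struct demo_queue {
	int	len;
	int	drops;
	int	maxlen;
};

/* Multi-statement macro wrapped so it behaves as one C statement. */
#define DEMO_ENQUEUE(q) do {				\
	if ((q)->len >= (q)->maxlen)			\
		(q)->drops++;				\
	else						\
		(q)->len++;				\
} while (/* CONSTCOND */ 0)

/* Try to enqueue n items; returns the resulting queue length. */
int
demo_fill(struct demo_queue *q, int n)
{
	int i;

	for (i = 0; i < n; i++)
		if (q->maxlen > 0)
			DEMO_ENQUEUE(q);	/* ';' ends the statement */
		else
			return -1;

	return q->len;
}
```

Had the macro been a bare { ... } block, the semicolon after
DEMO_ENQUEUE(q) would have terminated the if statement and the else
that follows would not compile.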


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add a if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow to register callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange get updated on every packet transmission/receipt.
now: if_lastchange get updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we needed two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.
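
As an illustration of why such a wrapper matters (a sketch only; REF_RELEASE,
struct ref and release_if are hypothetical names, not the IFAFREE() definition
from if.h), a multi-statement macro wrapped in "do {} while (0)" expands to a
single statement, so it stays safe inside an unbraced if/else:

```c
/* Hypothetical reference-counted object, for illustration only. */
struct ref { int count; int freed; };

/*
 * Hypothetical release macro. The do/while (0) wrapper makes the
 * expansion one statement that is terminated by the caller's ';'.
 */
#define REF_RELEASE(r) do {	\
	if ((r)->count <= 0)	\
		(r)->freed = 1;	\
	else			\
		(r)->count--;	\
} while (0)

/* Uses the macro in an unbraced if/else, the case the wrapper protects. */
int
release_if(struct ref *r, int cond)
{
	if (cond)
		REF_RELEASE(r);
	else
		return (-1);
	return (0);
}
```

If the macro body were a bare if/else, the caller's semicolon would end the
statement early and the trailing "else" in release_if() would no longer parse.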


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.215 11-Nov-2023 bluhm

Pass constant struct sockaddr to interface lookup functions.

OK mvs@


Revision tags: OPENBSD_7_4_BASE
# 1.214 30-May-2023 dlg

add net_tq_barriers

this waits once for something to end in all the net tqs.

ok claudio@


# 1.213 16-May-2023 jan

Use separate IFCAPs for LRO and TSO.

This diff introduces separate capabilities for TCP offloading. We split this
into LRO (large receive offloading) and TSO (TCP segmentation offloading).
LRO can be turned on/off via tcprecvoffload option of ifconfig and is not
inherited to sub interfaces.

TSO is inherited by sub interfaces to signal this hardware offloading capability
to the network stack.

With tweaks from bluhm, claudio and dlg

ok bluhm, claudio


# 1.212 15-May-2023 bluhm

Implement the TCP/IP layer for hardware TCP segmentation offload.
If the driver of a network interface claims to support TSO, do not
chop the packet in software, but pass it down to the interface
layer.
Precalculate parts of the pseudo header checksum, but without the
packet length. The length of all generated smaller packets is not
known yet. Driver and hardware will use the mbuf packet header
field ph_mss to calculate it and update checksum.
Introduce separate flags IFCAP_TSOv4 and IFCAP_TSOv6 as hardware
might support only one protocol family. The old flag IFXF_TSO is
only relevant for large receive offload. It is misnamed, but keep
that for now.
Note that drivers do not set TSO capabilities yet. Also the ifconfig
flags and pseudo interfaces capabilities will be done separately.
So this commit should not change behavior.
heavily based on the work from jan@; OK sashan@


Revision tags: OPENBSD_7_3_BASE
# 1.211 07-Mar-2023 jan

Avoid enabling TSO on interfaces which are already attached to a bridge.

with tweaks from claudio and deraadt

ok claudio, bluhm


# 1.210 27-Feb-2023 jan

Turn off TSO if interface is added to layer 2 devices.

ok bluhm@, claudio@


Revision tags: OPENBSD_7_2_BASE
# 1.209 27-Jun-2022 jan

Introduce Large Receive Offloading of TCP segment offloading for ix(4). It is
disabled by default. Also add a tso option to ifconfig(8) to enable and
disable this feature.

ok deraadt


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE OPENBSD_7_1_BASE
# 1.208 11-Mar-2021 florian

When RFC 8981 obsoleted RFC 4941 the terminology changed from
"privacy extensions" to "temporary address extensions"

Change ifconfig(8) to output temporary after temporary addresses and
add "temporary" option which is an alias for autoconfprivacy for now.

Also make AUTOCONF6TEMP a positive flag that is set by default.
Previously the negative flag "INET6_NOPRIVACY" was set when privacy
addresses were disabled. This makes the flags output less ugly and
will allow us to disable autoconf addresses while having temporary
addresses enabled in the future.

More work is needed in slaacd.

input benno, jmc, deraadt
previous version OK benno
OK jmc, kn


# 1.207 20-Feb-2021 dlg

add a MONITOR flag to ifaces to say they're only used for watching packets.

an example use of this is when you have a span port on a switch and
you want to be able to see the packets coming out of it with tcpdump,
but do not want these packets to enter the network stack for
processing. this is particularly important if the span port is
pushing a copy of any packets related to the machine doing the
monitoring as it will confuse pf states and the stack.

ok benno@


# 1.206 01-Feb-2021 mvs

ifunit() was fully replaced by if_unit(9) and should go away.

ok bluhm@ dlg@


# 1.205 18-Jan-2021 mvs

Introduce new function if_unit(9). This function returns a pointer to the
interface descriptor corresponding to the unique name. This descriptor
is guaranteed to be valid until if_put(9) is called on the returned
pointer. if_unit(9) should replace the already existing ifunit(), which
returns a descriptor that is not safe to dereference after a context switch.
This allows us to avoid some use-after-free issues in ioctl(2) path.
Also this unifies interface descriptor usage.

ok claudio@ sashan@


# 1.204 30-Sep-2020 mvs

We have no if_attachtail() function so remove the declaration.

ok deraadt@ claudio@


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.203 25-Jul-2019 krw

AF_INET comes before AF_INET6. Shorten line to <80 chars.

pointed out by claudio@


# 1.202 25-Jul-2019 krw

Add IFXF_AUTOCONF4 to if_xflags to match IFXF_AUTOCONF6. Let
ifconfig set/unset it.

ok deraadt@ kmos@


# 1.201 19-Apr-2019 dlg

add IF_HDRPRIO_OUTER for rxprio

IF_HDRPRIO_OUTER says you want the priority from the outer encap header.

ok claudio@


Revision tags: OPENBSD_6_5_BASE
# 1.200 10-Apr-2019 dlg

add struct if_sffpage so userland can read a page of sfp/qsfp info

this will be used by ifconfig so it can show you things like who
made what module you're using and when it was made, and on some
modules you get diag info like temperature, vcc, and rx and tx
powers.

im putting the kernel side in so we can keep fiddling with the
userland printf side of things.

this work was done based on a question by rachel roch
ok deraadt@
enthusiasm from many including mikeb@ sthen@ patrick@


# 1.199 23-Jan-2019 dlg

add a SIOCGPWE3 ioctl for interfaces to advertise pwe3 capability

im going to turn mpw into an ethernet interface, which includes
changing its if_type to IFT_ETHER. currently ldpd looks for if_type
IFT_MPLSTUNNEL to decide if an interface is a pseudowire, ie, it's
going to break. the ioctl will let ldpd ask the interface if it is
pseudowire capable as an alternative.

ok claudio@


# 1.198 12-Nov-2018 dlg

add ifreq bits for the tx header prio field ioctls

a tx header prio can be set to a fixed value from 0 to 7, or magic
values to represent populating the prio field from the encapsulated
packet, or from the mbuf prio value.

ok claudio@


# 1.197 12-Nov-2018 krw

Add new routing socket message RTM_80211INFO to provide details of
802.11 interface state changes (e.g. SSID) to interested parties.

Original diff from phessler@. Many suggestions and tweaks from
claudio@, stsp@, anton@.

ok claudio@ stsp@ anton@ phessler@


# 1.196 11-Nov-2018 dlg

use the llprio on gre(4) and eoip(4) interfaces for the keepalive tos

llprios are valued 0 to 7, while the ip tos/dscp/tclass is an 8 bit
value. fortunately the high 3 bits map nicely to the llprio values,
so we shift the llprio into place when generating the keepalive
frames. the llprio is defaulted to the value that cisco uses for
their gre keepalives.
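
The shift described above can be sketched as follows. This is a minimal
illustration, not the code from the commit; llprio_to_tos is a hypothetical
helper name, and the mask is an added assumption to keep the input in the
0 to 7 range:

```c
#include <stdint.h>

/*
 * Hypothetical helper: place a 0-7 link-layer priority into the
 * high 3 bits of an 8 bit tos/tclass value.
 */
uint8_t
llprio_to_tos(uint8_t prio)
{
	return ((uint8_t)((prio & 0x7) << 5));
}
```

With this mapping a priority of 6 becomes a tos value of 0xc0.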


Revision tags: OPENBSD_6_4_BASE
# 1.195 12-Sep-2018 krw

Fix obvious cut&pasto in comment (ifa_msghdr -> if_announcemsghdr).

ok claudio@


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.

ok krw@ tb@


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interface's parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interfaces function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input(), a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who is currently lebensmittelvergiftet.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@
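
A minimal sketch of the wraparound-tolerant comparison this entry describes, not the code from the commit. The helper name ticks_elapsed() and its parameters are hypothetical; it assumes a 32-bit int and the usual two's-complement conversion when the unsigned difference is cast back to int.

```c
/*
 * Hypothetical helper: returns 1 if at least `interval` ticks have
 * passed between `since` and `now`, even when the tick counter has
 * wrapped from INT_MAX to INT_MIN in between.
 */
static int
ticks_elapsed(int now, int since, int interval)
{
    /*
     * Subtract as unsigned so that wraparound is well defined, then
     * interpret the difference as a signed relative interval.
     */
    int diff = (int)((unsigned int)now - (unsigned int)since);

    return diff >= interval;
}
```

Comparing the two counters directly (now >= since + interval) would give the wrong answer once the counter wraps; comparing their difference against the interval does not.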


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With input from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, which
had been commented out since this code was committed back in 2001.
ok benno mpf
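
A rough illustration of the mask idea, assuming made-up flag values. The TOY_* names and bit assignments below are hypothetical and are not the real IFCAP_* definitions; the point is only that one mask replaces a hand-maintained list, so IPv6 checksum bits cannot be left out by accident.

```c
/* Hypothetical capability bits, for illustration only. */
#define TOY_CSUM_IPV4   0x01u
#define TOY_CSUM_TCPV4  0x02u
#define TOY_CSUM_UDPV4  0x04u
#define TOY_CSUM_TCPV6  0x08u
#define TOY_CSUM_UDPV6  0x10u
#define TOY_VLAN_MTU    0x20u   /* a non-checksum capability */

/* One mask covering every checksum capability. */
#define TOY_CSUM_MASK   (TOY_CSUM_IPV4 | TOY_CSUM_TCPV4 | \
    TOY_CSUM_UDPV4 | TOY_CSUM_TCPV6 | TOY_CSUM_UDPV6)

/* A child interface inherits only the parent's checksum capabilities. */
static unsigned int
toy_inherit_csum(unsigned int parent_caps)
{
    return parent_caps & TOY_CSUM_MASK;
}
```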


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
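
A sketch of what treating "unknown" as up could look like. The enum values and the TOY_LINK_STATE_IS_UP() name are assumptions made for illustration; the real LINK_STATE_* constants and macro in if.h may differ.

```c
/* Hypothetical link states; the ordering is an assumption. */
enum toy_link_state {
    TOY_LINK_UNKNOWN = 0,   /* driver cannot report link state */
    TOY_LINK_DOWN    = 1,
    TOY_LINK_UP      = 2
};

/* Up means "reported up" or "state not known". */
#define TOY_LINK_STATE_IS_UP(s) \
    ((s) >= TOY_LINK_UP || (s) == TOY_LINK_UNKNOWN)
```

With a check of this shape, callers no longer need a separate special case for drivers that never report link state.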


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows us to clean up the MPLS input and output paths and fix mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalive packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows binding interfaces to
alternate routing tables and separating them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered; still missing are pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx rings size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce an if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows specifying less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@
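
The arithmetic can be sketched as follows. This is an illustration under stated assumptions, not the kernel code: TOY_RTP_STATIC and toy_route_priority() are hypothetical, the value 8 is assumed, and 0 is used here to mean "no explicit priority given". It also assumes that numerically lower priorities are preferred.

```c
#define TOY_RTP_STATIC 8    /* assumed base priority for static routes */

/*
 * Pick the priority for a new route: an explicit priority wins,
 * otherwise the interface's own priority is added to the base.
 */
static int
toy_route_priority(int explicit_prio, int if_priority)
{
    if (explicit_prio != 0)
        return explicit_prio;
    return TOY_RTP_STATIC + if_priority;
}
```

Under these assumptions, a route via an interface with priority 4 gets 12 and so loses to the same route via an interface with priority 0, which gets 8.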


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interface's start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now its queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows to specify a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
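
Why the do ... while form matters can be shown with a toy macro. The struct and the TOY_DROP() macro below are hypothetical and unrelated to the real IF_ENQUEUE(); they only demonstrate the idiom.

```c
/* Hypothetical queue bookkeeping, for illustration only. */
struct toy_queue {
    int len;
    int drops;
};

/*
 * Written as two bare statements, only the first would be governed
 * by an unbraced "if", and the "else" below would not compile.
 * Wrapping the body in do { } while (0) makes it a single statement
 * that still requires a trailing semicolon.
 */
#define TOY_DROP(q) do {    \
    (q)->drops++;           \
    (q)->len--;             \
} while (0)

static void
toy_drop_if(struct toy_queue *q, int cond)
{
    if (cond)
        TOY_DROP(q);
    else
        q->len++;
}
```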


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on a similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add an if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow to register callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange get updated on every packet transmission/receipt.
now: if_lastchange get updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we needed two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.
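
A minimal userland sketch of why the "do {} while (0)" wrapper matters, assuming
a simplified stand-in for the macro. The names demo_ifaddr, DEMO_IFAFREE and
demo_release are hypothetical and are not the kernel's definitions. Wrapped
this way, the macro expands to a single statement, so it can be used as the
unbraced body of an if/else without the trailing else binding to the macro's
own internal if.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ifaddr; only the refcount matters here. */
struct demo_ifaddr {
	int refcnt;
	int freed;
};

/*
 * Illustrative macro, not the kernel's IFAFREE(): drop one reference,
 * or mark the object freed once no references remain.  The do/while (0)
 * wrapper turns the whole body into one statement.
 */
#define DEMO_IFAFREE(ifa) do {			\
	if ((ifa)->refcnt <= 0)			\
		(ifa)->freed = 1;		\
	else					\
		(ifa)->refcnt--;		\
} while (0)

/* Uses the macro as the unbraced body of an if/else. */
static int
demo_release(struct demo_ifaddr *ifa, int want)
{
	assert(ifa != NULL);
	if (want)
		DEMO_IFAFREE(ifa);
	else
		return (0);	/* this else pairs with "if (want)" */
	return (1);
}
```

Without the wrapper, the semicolon after the macro call would end the outer if
statement early and the following else would fail to compile or pair with the
wrong if.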


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.214 30-May-2023 dlg

add net_tq_barriers

this waits once for something to end in all the net tqs.

ok claudio@


# 1.213 16-May-2023 jan

Use separate IFCAPs for LRO and TSO.

This diff introduces separate capabilities for TCP offloading. We split this
into LRO (large receive offloading) and TSO (TCP segmentation offloading).
LRO can be turned on/off via tcprecvoffload option of ifconfig and is not
inherited to sub interfaces.

TSO is inherited by sub interfaces to signal this hardware offloading capability
to the network stack.

With tweaks from bluhm, claudio and dlg

ok bluhm, claudio


# 1.212 15-May-2023 bluhm

Implement the TCP/IP layer for hardware TCP segmentation offload.
If the driver of a network interface claims to support TSO, do not
chop the packet in software, but pass it down to the interface
layer.
Precalculate parts of the pseudo header checksum, but without the
packet length. The length of all generated smaller packets is not
known yet. Driver and hardware will use the mbuf packet header
field ph_mss to calculate it and update checksum.
Introduce separate flags IFCAP_TSOv4 and IFCAP_TSOv6 as hardware
might support only one protocol family. The old flag IFXF_TSO is
only relevant for large receive offload. It is misnamed, but keep
that for now.
Note that drivers do not set TSO capabilities yet. Also the ifconfig
flags and pseudo interfaces capabilities will be done separately.
So this commit should not change behavior.
heavily based on the work from jan@; OK sashan@


Revision tags: OPENBSD_7_3_BASE
# 1.211 07-Mar-2023 jan

Avoid enabling TSO on interfaces which are already attached to a bridge.

with tweaks from claudio and deraadt

ok claudio, bluhm


# 1.210 27-Feb-2023 jan

Turn off TSO if interface is added to layer 2 devices.

ok bluhm@, claudio@


Revision tags: OPENBSD_7_2_BASE
# 1.209 27-Jun-2022 jan

Introduce Large Receive Offloading of TCP segment offloading for ix(4). It is
disabled by default. Also add a tso option to ifconfig(8) to enable and
disable this feature.

ok deraadt


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE OPENBSD_7_1_BASE
# 1.208 11-Mar-2021 florian

When RFC 8981 obsoleted RFC 4941 the terminology changed from
"privacy extensions" to "temporary address extensions"

Change ifconfig(8) to output temporary after temporary addresses and
add "temporary" option which is an alias for autoconfprivacy for now.

Also make AUTOCONF6TEMP a positive flag that is set by default.
Previously the negative flag "INET6_NOPRIVACY" was set when privacy
addresses were disabled. This makes the flags output less ugly and
will allow us to disable autoconf addresses while having temporary
addresses enabled in the future.

More work is needed in slaacd.

input benno, jmc, deraadt
previous version OK benno
OK jmc, kn


# 1.207 20-Feb-2021 dlg

add a MONITOR flag to ifaces to say they're only used for watching packets.

an example use of this is when you have a span port on a switch and
you want to be able to see the packets coming out of it with tcpdump,
but do not want these packets to enter the network stack for
processing. this is particularly important if the span port is
pushing a copy of any packets related to the machine doing the
monitoring as it will confuse pf states and the stack.

ok benno@


# 1.206 01-Feb-2021 mvs

ifunit() was fully replaced by if_unit(9) and should go away.

ok bluhm@ dlg@


# 1.205 18-Jan-2021 mvs

Introduce new function if_unit(9). This function returns a pointer to the
interface descriptor corresponding to the unique name. This descriptor
is guaranteed to be valid until if_put(9) is called on the returned
pointer. if_unit(9) should replace already existent ifunit() which
returns descriptor not safe for dereference when context was switched.
This allows us to avoid some use-after-free issues in ioctl(2) path.
Also this unifies interface descriptor usage.

ok claudio@ sashan@


# 1.204 30-Sep-2020 mvs

We have no if_attachtail() function so remove the declaration.

ok deraadt@ claudio@


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.203 25-Jul-2019 krw

AF_INET comes before AF_INET6. Shorten line to <80 chars.

pointed out by claudio@


# 1.202 25-Jul-2019 krw

Add IFXF_AUTOCONF4 to if_xflags to match IFXF_AUTOCONF6. Let
ifconfig set/unset it.

ok deraadt@ kmos@


# 1.201 19-Apr-2019 dlg

add IF_HDRPRIO_OUTER for rxprio

IF_HDRPRIO_OUTER says you want the priority from the outer encap header.

ok claudio@


Revision tags: OPENBSD_6_5_BASE
# 1.200 10-Apr-2019 dlg

add struct if_sffpage so userland can read a page of sfp/qsfp info

this will be used by ifconfig so it can show you things like who
made what module you're using and when it was made, and on some
modules you get diag info like temperature, vcc, and rx and tx
powers.

im putting the kernel side in so we can keep fiddling with the
userland printf side of things.

this work was done based on a question by rachel roch
ok deraadt@
enthusiasm from many including mikeb@ sthen@ patrick@


# 1.199 23-Jan-2019 dlg

add a SIOCGPWE3 ioctl for interfaces to advertise pwe3 capability

im going to turn mpw into an ethernet interface, which includes
changing its if_type to IFT_ETHER. currently ldpd looks for if_type
IFT_MPLSTUNNEL to decide if an interface is a pseudowire, ie, it's
going to break. the ioctl will let ldpd ask the interface if it is
pseudowire capable as an alternative.

ok claudio@


# 1.198 12-Nov-2018 dlg

add ifreq bits for the tx header prio field ioctls

a tx header prio can be set to a fixed value from 0 to 7, or magic
values to represent populating the prio field from the encapsulated
packet, or from the mbuf prio value.

ok claudio@


# 1.197 12-Nov-2018 krw

Add new routing socket message RTM_80211INFO to provide details of
802.11 interface state changes (e.g. SSID) to interested parties.

Original diff from phessler@. Many suggestions and tweaks from
claudio@, stsp@, anton@.

ok claudio@ stsp@ anton@ phessler@


# 1.196 11-Nov-2018 dlg

use the llprio on gre(4) and eoip(4) interfaces for the keepalive tos

llprios are valued 0 to 7, while the ip tos/dscp/tclass is an 8 bit
value. fortunately the high 3 bits map nicely to the llprio values,
so we shift the llprio into place when generating the keepalive
frames. the llprio is defaulted to the value that cisco uses for
their gre keepalives.
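
The mapping described above can be sketched as follows, assuming the 3-bit
llprio simply occupies the top three bits of the 8-bit tos byte. The function
names are hypothetical and the value 6 in the usage note is only an example,
not necessarily the default chosen in this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Shift a link-level priority (0..7) into the high 3 bits of a tos byte. */
static uint8_t
demo_llprio_to_tos(unsigned int llprio)
{
	assert(llprio <= 7);
	return (uint8_t)(llprio << 5);
}

/* Recover the priority from the high 3 bits of a tos byte. */
static unsigned int
demo_tos_to_llprio(uint8_t tos)
{
	return (unsigned int)tos >> 5;
}
```

For example, demo_llprio_to_tos(6) yields 0xc0, and demo_tos_to_llprio(0xc0)
gives back 6.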


Revision tags: OPENBSD_6_4_BASE
# 1.195 12-Sep-2018 krw

Fix obvious cut&pasto in comment (ifa_msghdr -> if_announcemsghdr).

ok claudio@


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.

ok krw@ tb@
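
A minimal sketch of the tightened range check, assuming the handler rejects
out-of-range values with EINVAL. The names demo_set_llprio and DEMO_MAXPRIO
are hypothetical; this is not the kernel's ioctl code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_MINPRIO	0	/* lowest priority the queues handle */
#define DEMO_MAXPRIO	7	/* highest priority the queues handle */

/* Store prio only if it is within 0..7 inclusive. */
static int
demo_set_llprio(int *llprio, int prio)
{
	assert(llprio != NULL);
	if (prio < DEMO_MINPRIO || prio > DEMO_MAXPRIO)
		return (EINVAL);
	*llprio = prio;
	return (0);
}
```

A value such as 200 passes a "less than UCHAR_MAX" test but is rejected here,
which matches what the priority queue code expects.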


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.
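
The alignment arithmetic behind this can be sketched as below, assuming a
14 byte ethernet header, a 2 byte alignment pad and a 4 byte alignment
requirement for the ip header. All names are hypothetical; the constants
mirror the usual values but are defined locally for the illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ETHER_HDR_LEN	14	/* dst + src + ethertype */
#define DEMO_ETHER_ALIGN	2	/* pad that realigns the payload */

/* Offset of the ip header when the frame starts at frame_start. */
static size_t
demo_ip_offset(size_t frame_start)
{
	return frame_start + DEMO_ETHER_HDR_LEN;
}

/* True if off sits on a 4 byte boundary. */
static int
demo_is_aligned4(size_t off)
{
	return (off % 4) == 0;
}

/* Bytes left for the chip once the alignment pad has been skipped. */
static size_t
demo_space_after_align(size_t cluster_size)
{
	assert(cluster_size >= DEMO_ETHER_ALIGN);
	return cluster_size - DEMO_ETHER_ALIGN;
}
```

Starting the frame at offset 0 puts the ip header at 14, which is not 4 byte
aligned; starting at 2 puts it at 16. A 2048 byte cluster then has only 2046
bytes left, while a 2050 byte cluster still offers the full 2048.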


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interfaces parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interfaces function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@
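
A single-threaded userland sketch of the get/put pairing, assuming a small
static table indexed by interface index. The demo_ names and the table are
hypothetical, and the real kernel code needs atomic or otherwise serialised
counters that this illustration leaves out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ifnet. */
struct demo_ifnet {
	unsigned int index;
	unsigned int refcnt;
};

#define DEMO_NIFS 2
static struct demo_ifnet demo_iftable[DEMO_NIFS] = { { 0, 0 }, { 1, 0 } };

/* Look up an interface by index and take a reference on it. */
static struct demo_ifnet *
demo_if_get(unsigned int index)
{
	if (index >= DEMO_NIFS)
		return NULL;
	demo_iftable[index].refcnt++;
	return &demo_iftable[index];
}

/* Release a reference taken by demo_if_get(); NULL is ignored here. */
static void
demo_if_put(struct demo_ifnet *ifp)
{
	if (ifp == NULL)
		return;
	assert(ifp->refcnt > 0);
	ifp->refcnt--;
}

/* A detach would have to wait until no references remain. */
static int
demo_can_detach(const struct demo_ifnet *ifp)
{
	return ifp->refcnt == 0;
}
```

Every successful demo_if_get() is matched by one demo_if_put(), so the count
returns to zero and a detach can proceed.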


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@
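
One way to picture the global marker is the sketch below, assuming a free
running tick counter and a caller-supplied window. The variable names and the
window are hypothetical and do not reproduce the kernel's actual comparison.

```c
#include <assert.h>

/* Hypothetical stand-ins for the tick counter and the global marker. */
static int demo_ticks;
static int demo_congestion_mark;

/* Record "we are congested now" by remembering the current tick. */
static void
demo_mark_congested(void)
{
	demo_congestion_mark = demo_ticks;
}

/*
 * The system counts as congested while the tick counter is still within
 * window ticks of the marker.  Unsigned subtraction keeps the distance
 * meaningful if the counter wraps.
 */
static int
demo_is_congested(int window)
{
	unsigned int diff;

	assert(window >= 0);
	diff = (unsigned int)demo_ticks - (unsigned int)demo_congestion_mark;
	return diff <= (unsigned int)window;
}
```

No timer is needed to clear the state: as the tick counter advances past the
window, the comparison simply starts to fail.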


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input() a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who is currently lebensmittelvergiftet.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@
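
The wraparound-tolerant comparison can be illustrated in userland as below.
This is a sketch under assumptions: the names are hypothetical, and the
subtraction is done on unsigned values and converted back to int so that the
example avoids signed overflow while still yielding a relative distance on
the usual two's complement platforms.

```c
#include <assert.h>
#include <limits.h>

/* Signed distance from then to now, valid across counter wraparound. */
static int
demo_ticks_since(int now, int then)
{
	return (int)((unsigned int)now - (unsigned int)then);
}

/* True once at least interval ticks have passed since then. */
static int
demo_interval_passed(int now, int then, int interval)
{
	assert(interval >= 0);
	return demo_ticks_since(now, then) >= interval;
}
```

If the counter wraps from INT_MAX to INT_MIN between the two samples, a
direct "now > then" test gives the wrong answer, whereas the difference still
reports the few ticks that actually elapsed.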


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With inputs from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf
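
A sketch of how such a mask lets a child interface inherit only the checksum
capabilities of its parent. The DEMO_ constants and their bit values are
assumptions made for the illustration and should not be read as the values
in if.h.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative capability bits (values are assumptions). */
#define DEMO_IFCAP_CSUM_IPv4	0x001u
#define DEMO_IFCAP_CSUM_TCPv4	0x002u
#define DEMO_IFCAP_CSUM_UDPv4	0x004u
#define DEMO_IFCAP_VLAN_MTU	0x010u
#define DEMO_IFCAP_CSUM_TCPv6	0x080u
#define DEMO_IFCAP_CSUM_UDPv6	0x100u

/* One mask covering every checksum capability, v4 and v6 alike. */
#define DEMO_IFCAP_CSUM_MASK	(DEMO_IFCAP_CSUM_IPv4 |		\
    DEMO_IFCAP_CSUM_TCPv4 | DEMO_IFCAP_CSUM_UDPv4 |		\
    DEMO_IFCAP_CSUM_TCPv6 | DEMO_IFCAP_CSUM_UDPv6)

/* Capabilities a child interface inherits from its parent. */
static uint32_t
demo_inherit_csum(uint32_t parent_caps)
{
	/* Sanity check: the mask must not cover non-checksum bits. */
	assert((DEMO_IFCAP_CSUM_MASK & DEMO_IFCAP_VLAN_MTU) == 0);
	return parent_caps & DEMO_IFCAP_CSUM_MASK;
}
```

Because the mask is defined in one place, adding a new checksum bit to it
automatically extends what the child inherits, instead of relying on a
hand-maintained list at each call site.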


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
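
The changed predicate can be sketched as below, assuming link states are
ordered so that every "up" variant compares greater than or equal to the plain
up state. The DEMO_ names and numeric values are illustrative assumptions, not
a copy of the header.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative link states (values are assumptions). */
#define DEMO_LINK_STATE_UNKNOWN		0
#define DEMO_LINK_STATE_DOWN		2
#define DEMO_LINK_STATE_UP		4
#define DEMO_LINK_STATE_FULL_DUPLEX	6

/* Unknown counts as up; anything at or above UP counts as up. */
#define DEMO_LINK_STATE_IS_UP(s)				\
	((s) >= DEMO_LINK_STATE_UP || (s) == DEMO_LINK_STATE_UNKNOWN)

/* Count how many of the n given states are considered up. */
static size_t
demo_count_up(const int *states, size_t n)
{
	size_t i, up = 0;

	assert(states != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (DEMO_LINK_STATE_IS_UP(states[i]))
			up++;
	return up;
}
```

With this definition callers no longer need a separate check for drivers that
never report a link state.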


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows us to clean up the MPLS input and output paths and fix mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladdr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalive packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows binding interfaces to
an alternate routing table and separating them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx rings size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce an if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows specifying less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@
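
The rule above can be sketched as follows. This is an assumption-laden illustration: the struct, the function name, RTP_NONE and the value of RTP_STATIC are hypothetical stand-ins, and it assumes lower priority values are preferred.

```c
/* Hypothetical stand-ins; the value of RTP_STATIC here is an assumption. */
#define RTP_NONE	0	/* caller gave no explicit priority */
#define RTP_STATIC	8	/* base priority of static routes */

struct ifnet_sketch {
	unsigned char	if_priority;	/* per-interface offset */
};

/* An explicit priority wins; otherwise use the base plus the offset. */
static unsigned char
route_default_prio(const struct ifnet_sketch *ifp, unsigned char prio)
{
	if (prio == RTP_NONE)
		prio = RTP_STATIC + ifp->if_priority;
	return (prio);
}
```

An interface with a larger if_priority ends up with a numerically larger route priority, so under the stated assumption it only takes over when the route through the primary interface goes away.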


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interface's start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now it's queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows specifying a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
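
A short sketch of why the do ... while (0) wrapper matters for multi-statement macros. The queue type and macro name below are hypothetical and are not the real struct ifqueue or IF_ENQUEUE().

```c
/* Hypothetical queue type for illustration; not the real struct ifqueue. */
struct q_sketch {
	int	len;
	int	maxlen;
	int	drops;
};

/* Multi-statement macro wrapped so it behaves as one C statement. */
#define Q_ENQUEUE_SKETCH(q) do {		\
	if ((q)->len >= (q)->maxlen)		\
		(q)->drops++;			\
	else					\
		(q)->len++;			\
} while (0)

/* A bare { } block followed by ';' would orphan the else below. */
static void
enqueue_if(struct q_sketch *q, int cond)
{
	if (cond)
		Q_ENQUEUE_SKETCH(q);
	else
		q->drops = -1;
}
```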


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on a similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add an if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed by deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok
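
The existence check described above might look like the sketch below. The variable names follow the log message, but their types, the opaque struct and the helper function name are assumptions made for illustration.

```c
#include <stddef.h>

struct ifnet;				/* opaque in this sketch */

/* Assumed shapes: a table with if_indexlim slots, where a slot may be NULL. */
static struct ifnet	**ifindex2ifnet;
static unsigned int	  if_indexlim;

/* Hypothetical helper: nonzero only if idx is in range and the slot is set. */
static int
if_index_exists(unsigned int idx)
{
	return (idx < if_indexlim && ifindex2ifnet[idx] != NULL);
}
```

Both conditions are needed: the bound guards the array access, and the NULL test covers slots left empty by a destroyed interface.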


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow to register callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange gets updated on every packet transmission/receipt.
now: if_lastchange gets updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we needed two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.211 07-Mar-2023 jan

Avoid enabling TSO on interfaces which are already attached to a bridge.

with tweaks from claudio and deraadt

ok claudio, bluhm


# 1.210 27-Feb-2023 jan

Turn off TSO if interface is added to layer 2 devices.

ok bluhm@, claudio@


Revision tags: OPENBSD_7_2_BASE
# 1.209 27-Jun-2022 jan

Introduce Large Receive Offloading of TCP segment offloading for ix(4). It is
disabled by default. Also add a tso option to ifconfig(8) to enable and
disable this feature.

ok deraadt


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE OPENBSD_7_1_BASE
# 1.208 11-Mar-2021 florian

When RFC 8981 obsoleted RFC 4941 the terminology changed from
"privacy extensions" to "temporary address extensions"

Change ifconfig(8) to output temporary after temporary addresses and
add "temporary" option which is an alias for autoconfprivacy for now.

Also make AUTOCONF6TEMP a positive flag that is set by default.
Previously the negative flag "INET6_NOPRIVACY" was set when privacy
addresses were disabled. This makes the flags output less ugly and
will allow us to disable autoconf addresses while having temporary
addresses enabled in the future.

More work is needed in slaacd.

input benno, jmc, deraadt
previous version OK benno
OK jmc, kn


# 1.207 20-Feb-2021 dlg

add a MONITOR flag to ifaces to say they're only used for watching packets.

an example use of this is when you have a span port on a switch and
you want to be able to see the packets coming out of it with tcpdump,
but do not want these packets to enter the network stack for
processing. this is particularly important if the span port is
pushing a copy of any packets related to the machine doing the
monitoring as it will confuse pf states and the stack.

ok benno@


# 1.206 01-Feb-2021 mvs

ifunit() was fully replaced by if_unit(9) and should go away.

ok bluhm@ dlg@


# 1.205 18-Jan-2021 mvs

Introduce new function if_unit(9). This function returns a pointer to the
interface descriptor corresponding to the unique name. This descriptor
is guaranteed to be valid until if_put(9) is called on the returned
pointer. if_unit(9) should replace the existing ifunit(), which
returns a descriptor that is not safe to dereference after a context switch.
This allows us to avoid some use-after-free issues in the ioctl(2) path.
This also unifies interface descriptor usage.

ok claudio@ sashan@


# 1.204 30-Sep-2020 mvs

We have no if_attachtail() function so remove the declaration.

ok deraadt@ claudio@


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.203 25-Jul-2019 krw

AF_INET comes before AF_INET6. Shorten line to <80 chars.

pointed out by claudio@


# 1.202 25-Jul-2019 krw

Add IFXF_AUTOCONF4 to if_xflags to match IFXF_AUTOCONF6. Let
ifconfig set/unset it.

ok deraadt@ kmos@


# 1.201 19-Apr-2019 dlg

add IF_HDRPRIO_OUTER for rxprio

IF_HDRPRIO_OUTER says you want the priority from the outer encap header.

ok claudio@


Revision tags: OPENBSD_6_5_BASE
# 1.200 10-Apr-2019 dlg

add struct if_sffpage so userland can read a page of sfp/qsfp info

this will be used by ifconfig so it can show you things like who
made the module you're using and when it was made, and on some
modules you get diag info like temperature, vcc, and rx and tx
powers.

im putting the kernel side in so we can keep fiddling with the
userland printf side of things.

this work was done based on a question by rachel roch
ok deraadt@
enthusiasm from many including mikeb@ sthen@ patrick@


# 1.199 23-Jan-2019 dlg

add a SIOCGPWE3 ioctl for interfaces to advertise pwe3 capability

im going to turn mpw into an ethernet interface, which includes
changing its if_type to IFT_ETHER. currently ldpd looks for if_type
IFT_MPLSTUNNEL to decide if an interface is a pseudowire, ie, it's
going to break. the ioctl will let ldpd ask the interface if it is
pseudowire capable as an alternative.

ok claudio@


# 1.198 12-Nov-2018 dlg

add ifreq bits for the tx header prio field ioctls

a tx header prio can be set to a fixed value from 0 to 7, or magic
values to represent populating the prio field from the encapsulated
packet, or from the mbuf prio value.

ok claudio@


# 1.197 12-Nov-2018 krw

Add new routing socket message RTM_80211INFO to provide details of
802.11 interface state changes (e.g. SSID) to interested parties.

Original diff from phessler@. Many suggestions and tweaks from
claudio@, stsp@, anton@.

ok claudio@ stsp@ anton@ phessler@


# 1.196 11-Nov-2018 dlg

use the llprio on gre(4) and eoip(4) interfaces for the keepalive tos

llprios are valued 0 to 7, while the ip tos/dscp/tclass is an 8 bit
value. fortunately the high 3 bits map nicely to the llprio values,
so we shift the llprio into place when generating the keepalive
frames. the llprio is defaulted to the value that cisco uses for
their gre keepalives.
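
The shift described above can be sketched as follows. The function and macro names are hypothetical; only the 0 to 7 range and the mapping into the high 3 bits of the 8 bit field come from the log message.

```c
#include <stdint.h>

#define LLPRIO_MAX_SKETCH	7	/* llprio range is 0..7 per the log */

/* Place a 3-bit llprio in the top 3 bits of the 8-bit tos field. */
static uint8_t
llprio_to_tos(uint8_t llprio)
{
	return ((uint8_t)((llprio & LLPRIO_MAX_SKETCH) << 5));
}
```

For example, an llprio of 6 becomes a tos value of 0xc0.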


Revision tags: OPENBSD_6_4_BASE
# 1.195 12-Sep-2018 krw

Fix obvious cut&pasto in comment (ifa_msghdr -> if_announcemsghdr).

ok claudio@


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.

ok krw@ tb@


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.
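
The alignment arithmetic behind this change can be sketched in plain C. This is only an illustration: the 14-byte Ethernet header and 2-byte pad are the conventional values, and every SKETCH_* constant and helper name below is hypothetical rather than kernel API.

```c
#include <stddef.h>

/* Assumed conventional values; all names here are illustrative only. */
#define SKETCH_ETHER_HDR_LEN 14    /* dst MAC + src MAC + ethertype */
#define SKETCH_ETHER_ALIGN    2    /* pad placed in front of the frame */
#define SKETCH_RXBUF_2K    2048    /* rx buffer size the chip asks for */

/* Offset of the IP header from the cluster start when the frame begins at pad. */
static size_t
sketch_ip_offset(size_t pad)
{
	return pad + SKETCH_ETHER_HDR_LEN;
}

/* Nonzero when the IP header lands on a 4-byte boundary. */
static int
sketch_ip_aligned(size_t pad)
{
	return sketch_ip_offset(pad) % 4 == 0;
}

/* Bytes the cluster must hold: the chip's rx buffer plus the pad. */
static size_t
sketch_cluster_bytes(size_t rxbuf, size_t pad)
{
	return rxbuf + pad;
}
```

With no pad the IP header sits at offset 14 and is misaligned; with the 2-byte pad it sits at offset 16, but the cluster then has to hold 2050 bytes, which is the 2k + 2 size the new pool provides.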


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interfaces parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interfaces function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@
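
The get/put pairing described in revision 1.166 can be sketched in userland C. This is a single-threaded model under stated assumptions: struct sketch_ifnet, the table, and the sketch_* functions are hypothetical stand-ins, and a real kernel implementation would need atomic or otherwise serialised counter updates.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct ifnet; only what the sketch needs. */
struct sketch_ifnet {
	unsigned int if_index;
	unsigned int if_refcnt;
};

#define SKETCH_NIFS 4
static struct sketch_ifnet *sketch_iftable[SKETCH_NIFS];

/* Look up an interface by index and take a reference; NULL if absent. */
static struct sketch_ifnet *
sketch_if_get(unsigned int idx)
{
	struct sketch_ifnet *ifp;

	if (idx >= SKETCH_NIFS || (ifp = sketch_iftable[idx]) == NULL)
		return NULL;
	ifp->if_refcnt++;
	return ifp;
}

/* Release a reference taken by sketch_if_get(); NULL is accepted. */
static void
sketch_if_put(struct sketch_ifnet *ifp)
{
	if (ifp != NULL)
		ifp->if_refcnt--;
}
```

A detaching thread in this model would wait for if_refcnt to drop to zero, which is why callers are expected to hold a reference only briefly.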


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input() a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who is currently lebensmittelvergiftet.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With inputs from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther
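
The union trick from revision 1.136 can be shown in miniature. The structures below are hypothetical and much smaller than the real struct ifaliasreq; the point is only that a never-read union member with stricter alignment raises the alignment of the whole structure.

```c
#include <stddef.h>

/* Hypothetical miniature address type made only of bytes (alignment 1). */
struct sketch_sockaddr {
	unsigned char sa_len;
	unsigned char sa_family;
	char          sa_data[14];
};

/* Hypothetical request structure; not the real struct ifaliasreq. */
struct sketch_aliasreq {
	char name[16];
	union {
		struct sketch_sockaddr addr;   /* the member callers use */
		long long              align;  /* never read; raises alignment */
	} u;
};
```

Without the union, the compiler would be free to place such a structure at any byte address on the stack, which is a problem as soon as the address member is cast to a type with stricter alignment.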


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf
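
The effect of a single mask over a hand-written flag list can be sketched like this. The bit values and SKETCH_* names are made up for illustration and do not match the real IFCAP_* constants.

```c
/* Illustrative bit assignments only; not the real IFCAP_* values. */
#define SKETCH_CSUM_IPv4   0x01
#define SKETCH_CSUM_TCPv4  0x02
#define SKETCH_CSUM_UDPv4  0x04
#define SKETCH_CSUM_TCPv6  0x08
#define SKETCH_CSUM_UDPv6  0x10
#define SKETCH_VLAN_MTU    0x20   /* a capability that is not a checksum */

/* One mask covering every checksum capability. */
#define SKETCH_CSUM_MASK \
	(SKETCH_CSUM_IPv4 | SKETCH_CSUM_TCPv4 | SKETCH_CSUM_UDPv4 | \
	 SKETCH_CSUM_TCPv6 | SKETCH_CSUM_UDPv6)

/* Checksum capabilities a child interface inherits from its parent. */
static unsigned int
sketch_inherit_csum(unsigned int parent_caps)
{
	return parent_caps & SKETCH_CSUM_MASK;
}
```

Because the mask is defined next to the flags, a flag added or re-enabled later is inherited without touching the child driver.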


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
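
The rule from revision 1.123 can be written as a small predicate. The enum values and names below are assumptions chosen for the sketch; the real constants and macro live in <net/if.h> and may be spelled differently.

```c
/* Illustrative values only; the real constants are in <net/if.h>. */
enum sketch_link_state {
	SKETCH_LINK_UNKNOWN = 0,   /* driver has not reported a state */
	SKETCH_LINK_DOWN    = 1,
	SKETCH_LINK_UP      = 2
};

/* UNKNOWN counts as up; only an explicit DOWN counts as down. */
#define SKETCH_LINK_IS_UP(s) \
	((s) >= SKETCH_LINK_UP || (s) == SKETCH_LINK_UNKNOWN)
```

Treating an unreported state as up means callers need a single test and interfaces whose drivers never report link state keep working.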


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.
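
The difference between punishing a ring by an eighth and by half is easy to see numerically. The helper below is a hypothetical sketch of that arithmetic with a floor at a low watermark; the floor and the names are assumptions, not the actual mclgeti code.

```c
/* Shrink a ring's current watermark by one eighth, never below lwm. */
static unsigned int
sketch_punish(unsigned int cwm, unsigned int lwm)
{
	unsigned int next = cwm - cwm / 8;

	return next < lwm ? lwm : next;
}
```

Starting from 256 slots, one punishment leaves 224 rather than the 128 that halving would leave, so a ring recovers from a brief livelock much faster.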


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows to cleanup the MPLS input and output paths and fix mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).
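
The output-pointer switch described in revision 1.116 can be modelled in a few lines. Everything here is a hypothetical userland sketch: the structure, its field names, and the sketch_* functions are stand-ins, and packets are reduced to an int.

```c
struct sketch_ifnet;
typedef int (*sketch_output_fn)(struct sketch_ifnet *, int);

/* Hypothetical stand-in for struct ifnet. */
struct sketch_ifnet {
	sketch_output_fn if_output;     /* what the stack calls */
	sketch_output_fn if_ll_output;  /* saved link-level routine */
	int ll_calls;
	int mpls_calls;
};

/* Stand-in for a link-level output routine such as ether_output(). */
static int
sketch_ether_output(struct sketch_ifnet *ifp, int pkt)
{
	ifp->ll_calls++;
	return pkt;
}

/* MPLS layer: do its work, then hand off to the saved link-level routine. */
static int
sketch_mpls_output(struct sketch_ifnet *ifp, int pkt)
{
	ifp->mpls_calls++;
	return ifp->if_ll_output(ifp, pkt);
}

/* Setting or clearing the flag swaps the pointer the stack will call. */
static void
sketch_set_mpls(struct sketch_ifnet *ifp, int on)
{
	if (on && ifp->if_output != sketch_mpls_output) {
		ifp->if_ll_output = ifp->if_output;
		ifp->if_output = sketch_mpls_output;
	} else if (!on && ifp->if_output == sketch_mpls_output) {
		ifp->if_output = ifp->if_ll_output;
	}
}
```

The link-level routine never needs to know about MPLS in this arrangement; interfaces without the flag pay no extra cost.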


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladdr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalive packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx rings size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce a if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows to specify less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@
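
How an interface priority feeds into a route's priority can be sketched as below. The numeric values and the sketch_* names are assumptions made for the example; only the rule itself (the interface priority is added to the static base when no explicit priority is given) comes from the commit message.

```c
/* Assumed values for illustration; see <net/route.h> for the real ones. */
#define SKETCH_RTP_NONE    0   /* caller gave no explicit priority */
#define SKETCH_RTP_STATIC  8   /* base priority for static routes */

/* Priority a new route ends up with; a lower number is preferred. */
static int
sketch_route_priority(int requested, int if_priority)
{
	if (requested != SKETCH_RTP_NONE)
		return requested;
	return SKETCH_RTP_STATIC + if_priority;
}
```

Under these assumptions a default route on an interface with priority 4 gets 12 and loses to the same route on an interface with priority 0, which gets 8, until the preferred interface loses link.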


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interfaces start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now its queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows to specify a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
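
The do ... while form matters once a multi-statement macro is used as the body of an if. The queue below is a hypothetical miniature, not the real struct ifqueue or IF_ENQUEUE(), but the wrapping technique is the same.

```c
#include <stddef.h>

/* Hypothetical miniature packet and queue types. */
struct sketch_pkt {
	struct sketch_pkt *next;
};

struct sketch_queue {
	struct sketch_pkt *head;
	struct sketch_pkt *tail;
	int                len;
};

/*
 * Multi-statement macro wrapped in do { ... } while (0) so that
 * "if (x) SKETCH_ENQUEUE(q, p); else ..." parses as one statement.
 */
#define SKETCH_ENQUEUE(q, p) do {               \
	(p)->next = NULL;                       \
	if ((q)->tail == NULL)                  \
		(q)->head = (p);                \
	else                                    \
		(q)->tail->next = (p);          \
	(q)->tail = (p);                        \
	(q)->len++;                             \
} while (/* CONSTCOND */ 0)
```

With a bare { ... } block instead, the semicolon after the macro call would end the if statement and a following else would fail to compile.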


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add a if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow to register callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange get updated on every packet transmission/receipt.
now: if_lastchange get updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyway).

benefit: the following command now works. previously we needed two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.
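
A minimal sketch of why a multi-statement macro gets wrapped in "do {} while (0)". The names demo_ifaddr, demo_ifafree() and DEMO_IFAFREE() are made up for illustration and are not the kernel's definitions; the macro body is only assumed to be an if/else of the kind IFAFREE() guards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an interface address; not the kernel's struct ifaddr. */
struct demo_ifaddr {
	int refcnt;	/* references still held */
	int freed;	/* set once the address has been released */
};

void
demo_ifafree(struct demo_ifaddr *ifa)
{
	ifa->freed = 1;
}

/*
 * The if/else body is wrapped in do { } while (0) so the macro expands to
 * exactly one statement and the caller's trailing semicolon terminates it.
 */
#define DEMO_IFAFREE(ifa) do {			\
	if ((ifa)->refcnt <= 0)			\
		demo_ifafree(ifa);		\
	else					\
		(ifa)->refcnt--;		\
} while (0)

/* Returns 1 if the macro ran, 0 if the address was left alone. */
int
demo_release(struct demo_ifaddr *ifa, int want_release)
{
	assert(ifa != NULL);
	if (want_release)
		DEMO_IFAFREE(ifa);
	else	/* pairs with the outer if, not with the if inside the macro */
		return 0;
	return 1;
}
```

With a bare if/else or a plain brace block as the macro body, the semicolon after DEMO_IFAFREE(ifa) would end the outer if early and the following else would no longer compile.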


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.210 27-Feb-2023 jan

Turn off TSO if interface is added to layer 2 devices.

ok bluhm@, claudio@


Revision tags: OPENBSD_7_2_BASE
# 1.209 27-Jun-2022 jan

Introduce Large Receive Offloading of TCP segment offloading for ix(4). It is
disabled by default. Also add a tso option to ifconfig(8) to enable and
disable this feature.

ok deraadt


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE OPENBSD_7_1_BASE
# 1.208 11-Mar-2021 florian

When RFC 8981 obsoleted RFC 4941 the terminology changed from
"privacy extensions" to "temporary address extensions"

Change ifconfig(8) to output temporary after temporary addresses and
add "temporary" option which is an alias for autoconfprivacy for now.

Also make AUTOCONF6TEMP a positive flag that is set by default.
Previously the negative flag "INET6_NOPRIVACY" was set when privacy
addresses were disabled. This makes the flags output less ugly and
will allow us to disable autoconf addresses while having temporary
addresses enabled in the future.

More work is needed in slaacd.

input benno, jmc, deraadt
previous version OK benno
OK jmc, kn


# 1.207 20-Feb-2021 dlg

add a MONITOR flag to ifaces to say they're only used for watching packets.

an example use of this is when you have a span port on a switch and
you want to be able to see the packets coming out of it with tcpdump,
but do not want these packets to enter the network stack for
processing. this is particularly important if the span port is
pushing a copy of any packets related to the machine doing the
monitoring as it will confuse pf states and the stack.

ok benno@


# 1.206 01-Feb-2021 mvs

ifunit() was fully replaced by if_unit(9) and should go away.

ok bluhm@ dlg@


# 1.205 18-Jan-2021 mvs

Introduce new function if_unit(9). This function returns a pointer to the
interface descriptor corresponding to the unique name. This descriptor
is guaranteed to be valid until if_put(9) is called on the returned
pointer. if_unit(9) should replace the existing ifunit(), which
returns a descriptor that is not safe to dereference after a context switch.
This allows us to avoid some use-after-free issues in the ioctl(2) path.
Also this unifies interface descriptor usage.

ok claudio@ sashan@


# 1.204 30-Sep-2020 mvs

We have no if_attachtail() function so remove the declaration.

ok deraadt@ claudio@


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.203 25-Jul-2019 krw

AF_INET comes before AF_INET6. Shorten line to <80 chars.

pointed out by claudio@


# 1.202 25-Jul-2019 krw

Add IFXF_AUTOCONF4 to if_xflags to match IFXF_AUTOCONF6. Let
ifconfig set/unset it.

ok deraadt@ kmos@


# 1.201 19-Apr-2019 dlg

add IF_HDRPRIO_OUTER for rxprio

IF_HDRPRIO_OUTER says you want the priority from the outer encap header.

ok claudio@


Revision tags: OPENBSD_6_5_BASE
# 1.200 10-Apr-2019 dlg

add struct if_sffpage so userland can read a page of sfp/qsfp info

this will be used by ifconfig so it can show you things like who
made the module you're using and when it was made, and on some
modules you get diag info like temperature, vcc, and rx and tx
powers.

im putting the kernel side in so we can keep fiddling with the
userland printf side of things.

this work was done based on a question by rachel roch
ok deraadt@
enthusiasm from many including mikeb@ sthen@ patrick@


# 1.199 23-Jan-2019 dlg

add a SIOCGPWE3 ioctl for interfaces to advertise pwe3 capability

im going to turn mpw into an ethernet interface, which includes
changing its if_type to IFT_ETHER. currently ldpd looks for if_type
IFT_MPLSTUNNEL to decide if an interface is a pseudowire, ie, it's
going to break. the ioctl will let ldpd ask the interface if it is
pseudowire capable as an alternative.

ok claudio@


# 1.198 12-Nov-2018 dlg

add ifreq bits for the tx header prio field ioctls

a tx header prio can be set to a fixed value from 0 to 7, or magic
values to represent populating the prio field from the encapsulated
packet, or from the mbuf prio value.

ok claudio@


# 1.197 12-Nov-2018 krw

Add new routing socket message RTM_80211INFO to provide details of
802.11 interface state changes (e.g. SSID) to interested parties.

Original diff from phessler@. Many suggestions and tweaks from
claudio@, stsp@, anton@.

ok claudio@ stsp@ anton@ phessler@


# 1.196 11-Nov-2018 dlg

use the llprio on gre(4) and eoip(4) interfaces for the keepalive tos

llprios are valued 0 to 7, while the ip tos/dscp/tclass is an 8 bit
value. fortunately the high 3 bits map nicely to the llprio values,
so we shift the llprio into place when generating the keepalive
frames. the llprio is defaulted to the value that cisco uses for
their gre keepalives.
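
The shift described above can be sketched as follows. demo_llprio_to_tos() is a hypothetical helper, not the function used by gre(4); it only assumes that the 3-bit llprio lands in the top three bits of the 8-bit tos field, which is what "the high 3 bits map nicely" implies.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_LLPRIO_MAX	7	/* llprio values run from 0 to 7 */

/* Place a 3-bit llprio into the high 3 bits of an 8-bit tos value. */
uint8_t
demo_llprio_to_tos(int llprio)
{
	assert(llprio >= 0 && llprio <= DEMO_LLPRIO_MAX);
	return (uint8_t)(llprio << 5);
}
```

Under this mapping llprio 7 becomes tos 0xe0 and llprio 0 stays 0; the low five bits of the tos are left clear.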


Revision tags: OPENBSD_6_4_BASE
# 1.195 12-Sep-2018 krw

Fix obvious cut&pasto in comment (ifa_msghdr -> if_announcemsghdr).

ok claudio@


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.

ok krw@ tb@
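
A rough sketch of the tightened range check, assuming the ioctl handler rejects out-of-range values with EINVAL. demo_set_llprio() and the DEMO_ constants are illustrative names, not the kernel's.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_LLPRIO_MIN	0
#define DEMO_LLPRIO_MAX	7

/*
 * Accept only priorities the rest of the stack can handle (0 to 7
 * inclusive) instead of anything below UCHAR_MAX.
 */
int
demo_set_llprio(int *llprio, int requested)
{
	assert(llprio != NULL);
	if (requested < DEMO_LLPRIO_MIN || requested > DEMO_LLPRIO_MAX)
		return EINVAL;
	*llprio = requested;
	return 0;
}
```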


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.
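
The alignment arithmetic behind the extra 2 bytes, as a small sketch. The constants mirror the usual 14-byte Ethernet header and 2-byte ETHER_ALIGN pad, and a 4-byte alignment requirement for the IP header is assumed; the demo_ helpers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ETHER_HDR_LEN	14	/* dst + src + ethertype */
#define DEMO_ETHER_ALIGN	2	/* pad placed in front of the header */

/* Offset of the IP header from the start of the cluster. */
size_t
demo_ip_offset(size_t pad)
{
	return pad + DEMO_ETHER_HDR_LEN;
}

/* Nonzero if the IP header lands on a 4-byte boundary. */
int
demo_ip_aligned(size_t pad)
{
	return demo_ip_offset(pad) % 4 == 0;
}

/* Cluster space needed when the chip insists on a whole-kilobyte buffer. */
size_t
demo_cluster_needed(size_t bufsize, size_t pad)
{
	assert(bufsize % 1024 == 0);
	return bufsize + pad;
}
```

With no pad the IP header sits at offset 14 and is misaligned; with the 2-byte pad it sits at offset 16, and a 2048-byte rx buffer then needs 2050 bytes of cluster, which is what the mcl2k2 pool provides.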


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interface's parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interface's function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@
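
A userland sketch of the get/put pairing described above. Everything here (demo_ifnet, demo_iftable, the demo_ functions) is a simplified stand-in for illustration; the kernel's if_get()/if_put() also deal with locking and concurrent detach, which this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NIFS	4

/* Hypothetical, much reduced interface descriptor. */
struct demo_ifnet {
	unsigned int index;
	unsigned int refs;	/* references currently held */
	unsigned int mtu;
};

struct demo_ifnet *demo_iftable[DEMO_NIFS];

/* Look up an interface by index and take a reference on it. */
struct demo_ifnet *
demo_if_get(unsigned int index)
{
	struct demo_ifnet *ifp;

	if (index >= DEMO_NIFS || (ifp = demo_iftable[index]) == NULL)
		return NULL;
	ifp->refs++;
	return ifp;
}

/* Release a reference taken by demo_if_get(). */
void
demo_if_put(struct demo_ifnet *ifp)
{
	if (ifp == NULL)
		return;
	assert(ifp->refs > 0);
	ifp->refs--;
}

/* A detaching thread may only proceed once nobody holds a reference. */
int
demo_if_can_detach(const struct demo_ifnet *ifp)
{
	return ifp->refs == 0;
}

/* get and put live in the same function, so the reference is short-lived. */
int
demo_if_mtu(unsigned int index, unsigned int *mtu)
{
	struct demo_ifnet *ifp;

	ifp = demo_if_get(index);
	if (ifp == NULL)
		return -1;
	*mtu = ifp->mtu;
	demo_if_put(ifp);
	return 0;
}
```

Keeping both calls in one function, as demo_if_mtu() does, is what makes a missing put easy to spot by eye or by a static analyser.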


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@
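
A sketch of the "set the global to ticks, then compare" scheme. demo_ticks and demo_congestion are hypothetical stand-ins for the kernel's tick counter and the congestion marker, and the window length is left as a parameter because the commit does not state it.

```c
#include <assert.h>

int demo_ticks;		/* advances once per clock tick */
int demo_congestion;	/* value of demo_ticks when congestion was last seen */

/* Mark the whole system as congested right now. */
void
demo_mark_congested(void)
{
	demo_congestion = demo_ticks;
}

/*
 * The system counts as congested while the tick counter is within
 * `window` ticks of the marker; once it moves further away the
 * comparison fails and the congestion state expires on its own.
 */
int
demo_is_congested(int window)
{
	int diff;

	assert(window >= 0);
	/* subtract as unsigned so a wrapped counter does not overflow */
	diff = (int)((unsigned int)demo_ticks - (unsigned int)demo_congestion);
	return diff >= 0 && diff <= window;
}
```

No timeout is needed to clear the marker, which is the point of storing a timestamp instead of a flag. A real implementation would also have to pick a sensible initial marker so the system does not start out looking congested.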


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input() a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who is currently lebensmittelvergiftet.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@
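
The wrap-safe idiom mentioned in 1.152 can be sketched roughly as follows.
This is only an illustration under stated assumptions: tick_pending() is a
hypothetical helper, not a function from the tree, and the sketch assumes
the usual two's complement conversion from unsigned int to int.

```c
#include <limits.h>

/*
 * Hypothetical helper (not from the OpenBSD tree): returns nonzero when
 * the deadline `when` is still ahead of the current tick count `now`.
 *
 * Comparing the two values directly breaks once the counter wraps.
 * Taking the difference first and looking at its sign keeps working as
 * long as the two values are less than half the counter range apart.
 * The subtraction is done on unsigned values, where wraparound is well
 * defined, and the result is then read back as a signed distance.
 */
static int
tick_pending(int when, int now)
{
	int diff = (int)((unsigned int)when - (unsigned int)now);

	return (diff > 0);
}
```

A deadline that has wrapped past INT_MAX, for example INT_MIN + 5 seen from
a counter at INT_MAX - 5, is still reported as pending, which a plain
"when > now" test would get wrong.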


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With inputs from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf
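
The mask approach from 1.134 might look roughly like the sketch below. All
X_* names and bit values are made up for illustration; the real IFCAP_*
definitions live in if.h and may differ.

```c
/*
 * Hypothetical capability bits, chosen only for this example.
 * X_VLAN_MTU stands in for any capability that is not a checksum flag.
 */
#define X_CSUM_IPv4	0x0001
#define X_CSUM_TCPv4	0x0002
#define X_CSUM_UDPv4	0x0004
#define X_VLAN_MTU	0x0010
#define X_CSUM_TCPv6	0x0080
#define X_CSUM_UDPv6	0x0100

/* One mask covering every checksum flag, v4 and v6 alike. */
#define X_CSUM_MASK	(X_CSUM_IPv4 | X_CSUM_TCPv4 | X_CSUM_UDPv4 | \
			 X_CSUM_TCPv6 | X_CSUM_UDPv6)

/*
 * Hypothetical helper: the checksum capabilities a child interface
 * would inherit from its parent. Using the mask instead of an explicit
 * list means a flag added to the mask is inherited automatically.
 */
static unsigned int
inherit_csum_caps(unsigned int parent_caps)
{
	return (parent_caps & X_CSUM_MASK);
}
```

With these made-up values, a parent advertising X_CSUM_IPv4, X_VLAN_MTU and
X_CSUM_TCPv6 hands down only the two checksum bits.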


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
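
A minimal sketch of the check described in 1.123, treating "unknown" as up.
The X_* names and numeric values are assumptions made for this example and
are not copied from if.h.

```c
/*
 * Hypothetical link states. The ordering assumption is that every state
 * at or above X_LINK_STATE_UP means the link is usable.
 */
enum x_link_state {
	X_LINK_STATE_UNKNOWN	= 0,	/* driver does not report link state */
	X_LINK_STATE_DOWN	= 2,
	X_LINK_STATE_UP		= 4,
	X_LINK_STATE_FULL_DUPLEX = 6
};

/*
 * An interface whose driver cannot report link state is treated as up,
 * so callers need no separate special case for "unknown".
 */
#define X_LINK_STATE_IS_UP(s) \
	((s) >= X_LINK_STATE_UP || (s) == X_LINK_STATE_UNKNOWN)
```

The point of the design is that callers test one predicate instead of
comparing against several individual states.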


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows cleaning up the MPLS input and output paths and fixing mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladdr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalive packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx rings size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce a if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows to specify less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interface's start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now its queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows to specify a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
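
Why the do ... while wrapper from 1.89 matters can be sketched with a small
stand-in. Q_ENQUEUE(), struct node, struct queue and enqueue_if_room() are
hypothetical names for this illustration; this is not the real IF_ENQUEUE().

```c
#include <stddef.h>

struct node {
	struct node *next;
};

struct queue {
	struct node *head;
	struct node *tail;
	int len;
};

/*
 * Hypothetical multi-statement macro. Wrapping the body in
 * do { ... } while (0) makes the expansion a single C statement that
 * still requires the trailing semicolon at the call site.
 */
#define Q_ENQUEUE(q, n) do {				\
	(n)->next = NULL;				\
	if ((q)->tail == NULL)				\
		(q)->head = (n);			\
	else						\
		(q)->tail->next = (n);			\
	(q)->tail = (n);				\
	(q)->len++;					\
} while (/* CONSTCOND */ 0)

/*
 * Unbraced if/else around the macro. With a bare { ... } body the
 * semicolon after the macro call would end the if statement and leave
 * the else dangling, which is a syntax error.
 */
static int
enqueue_if_room(struct queue *q, struct node *n, int maxlen)
{
	if (q->len < maxlen)
		Q_ENQUEUE(q, n);
	else
		return (-1);
	return (0);
}
```

The same reasoning applies to the earlier IFAFREE() fixes in 1.9 and 1.15.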


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add a if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.
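
The distinction behind 1.40 can be sketched as follows. The variable
x_ifqmaxlen, its value and the helper x_queue_full() are hypothetical and
only serve the illustration; in a real tree the first line would sit in a
header and the second in exactly one .c file.

```c
/* Header side: a declaration. It names the object but allocates nothing. */
extern int x_ifqmaxlen;

/*
 * Source side: the single definition, which allocates and initializes
 * the object. Every other file only sees the extern declaration.
 */
int x_ifqmaxlen = 256;

/* Hypothetical user of the variable. */
static int
x_queue_full(int len)
{
	return (len >= x_ifqmaxlen);
}
```

A header carrying a plain "int foo;" leaves a tentative definition in every
file that includes it and relies on the linker merging them, which is the
kind of "common" that 1.58 also set out to avoid.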


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow to register callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange get updated on every packet transmission/receipt.
now: if_lastchange get updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we need two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.209 27-Jun-2022 jan

Introduce Large Receive Offloading of TCP segment offloading for ix(4). It is
disabled by default. Also add a tso option to ifconfig(8) to enable and
disable this feature.

ok deraadt


Revision tags: OPENBSD_6_9_BASE OPENBSD_7_0_BASE OPENBSD_7_1_BASE
# 1.208 11-Mar-2021 florian

When RFC 8981 obsoleted RFC 4941 the terminology changed from
"privacy extensions" to "temporary address extensions"

Change ifconfig(8) to output temporary after temporary addresses and
add "temporary" option which is an alias for autoconfprivacy for now.

Also make AUTOCONF6TEMP a positive flag that is set by default.
Previously the negative flag "INET6_NOPRIVACY" was set when privacy
addresses were disabled. This makes the flags output less ugly and
will allow us to disable autoconf addresses while having temporary
addresses enabled in the future.

More work is needed in slaacd.

input benno, jmc, deraadt
previous version OK benno
OK jmc, kn


# 1.207 20-Feb-2021 dlg

add a MONITOR flag to ifaces to say they're only used for watching packets.

an example use of this is when you have a span port on a switch and
you want to be able to see the packets coming out of it with tcpdump,
but do not want these packets to enter the network stack for
processing. this is particularly important if the span port is
pushing a copy of any packets related to the machine doing the
monitoring as it will confuse pf states and the stack.

ok benno@


# 1.206 01-Feb-2021 mvs

ifunit() was fully replaced by if_unit(9) and should go away.

ok bluhm@ dlg@


# 1.205 18-Jan-2021 mvs

Introduce new function if_unit(9). This function returns a pointer to the
interface descriptor corresponding to the unique name. This descriptor
is guaranteed to be valid until if_put(9) is called on the returned
pointer. if_unit(9) should replace the existing ifunit(), which returns
a descriptor that is not safe to dereference after a context switch.
This allows us to avoid some use-after-free issues in the ioctl(2) path.
It also unifies interface descriptor usage.

ok claudio@ sashan@


# 1.204 30-Sep-2020 mvs

We have no if_attachtail() function so remove the declaration.

ok deraadt@ claudio@


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.203 25-Jul-2019 krw

AF_INET comes before AF_INET6. Shorten line to <80 chars.

pointed out by claudio@


# 1.202 25-Jul-2019 krw

Add IFXF_AUTOCONF4 to if_xflags to match IFXF_AUTOCONF6. Let
ifconfig set/unset it.

ok deraadt@ kmos@


# 1.201 19-Apr-2019 dlg

add IF_HDRPRIO_OUTER for rxprio

IF_HDRPRIO_OUTER says you want the priority from the outer encap header.

ok claudio@


Revision tags: OPENBSD_6_5_BASE
# 1.200 10-Apr-2019 dlg

add struct if_sffpage so userland can read a page of sfp/qsfp info

this will be used by ifconfig so it can show you things like who
made the module you're using and when it was made, and on some
modules you get diag info like temperature, vcc, and rx and tx
powers.

im putting the kernel side in so we can keep fiddling with the
userland printf side of things.

this work was done based on a question by rachel roch
ok deraadt@
enthusiasm from many including mikeb@ sthen@ patrick@


# 1.199 23-Jan-2019 dlg

add a SIOCGPWE3 ioctl for interfaces to advertise pwe3 capability

im going to turn mpw into an ethernet interface, which includes
changing its if_type to IFT_ETHER. currently ldpd looks for if_type
IFT_MPLSTUNNEL to decide if an interface is a pseudowire, ie, it's
going to break. the ioctl will let ldpd ask the interface if it is
pseudowire capable as an alternative.

ok claudio@


# 1.198 12-Nov-2018 dlg

add ifreq bits for the tx header prio field ioctls

a tx header prio can be set to a fixed value from 0 to 7, or magic
values to represent populating the prio field from the encapsulated
packet, or from the mbuf prio value.

ok claudio@
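
The three kinds of tx header prio setting described above can be sketched as a small resolver. The constant names and the negative magic values below are assumptions made for the illustration; the real definitions are in if.h and may differ.

```c
/* Hypothetical constants; the real names and values live in if.h. */
#define DEMO_HDRPRIO_MIN	0
#define DEMO_HDRPRIO_MAX	7
#define DEMO_HDRPRIO_PACKET	(-1)	/* use the mbuf prio value */
#define DEMO_HDRPRIO_PAYLOAD	(-2)	/* copy from the encapsulated packet */

/*
 * Resolve the priority to write into an outgoing encapsulation header.
 * Returns -1 if "setting" is neither a fixed value nor a known magic one.
 */
static int
demo_tx_hdrprio(int setting, int mbuf_prio, int payload_prio)
{
	switch (setting) {
	case DEMO_HDRPRIO_PACKET:
		return (mbuf_prio);
	case DEMO_HDRPRIO_PAYLOAD:
		return (payload_prio);
	default:
		if (setting >= DEMO_HDRPRIO_MIN &&
		    setting <= DEMO_HDRPRIO_MAX)
			return (setting);
		return (-1);
	}
}
```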


# 1.197 12-Nov-2018 krw

Add new routing socket message RTM_80211INFO to provide details of
802.11 interface state changes (e.g. SSID) to interested parties.

Original diff from phessler@. Many suggestions and tweaks from
claudio@, stsp@, anton@.

ok claudio@ stsp@ anton@ phessler@


# 1.196 11-Nov-2018 dlg

use the llprio on gre(4) and eoip(4) interfaces for the keepalive tos

llprios are valued 0 to 7, while the ip tos/dscp/tclass is an 8 bit
value. fortunately the high 3 bits map nicely to the llprio values,
so we shift the llprio into place when generating the keepalive
frames. the llprio is defaulted to the value that cisco uses for
their gre keepalives.
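
The shift described above can be sketched in a few lines. The function name demo_llprio_to_tos is hypothetical and the range check is an addition for the example; only the idea of moving a 0 to 7 value into the high 3 bits of an 8 bit field comes from the commit message.

```c
#include <stdint.h>

#define DEMO_LLPRIO_MAX	7	/* llprio values run from 0 to 7 */

/*
 * Place a 3-bit llprio into the top three bits of an 8-bit
 * tos/traffic class value. Returns -1 for out-of-range input.
 */
static int
demo_llprio_to_tos(int llprio)
{
	if (llprio < 0 || llprio > DEMO_LLPRIO_MAX)
		return (-1);
	return ((uint8_t)(llprio << 5));
}
```

For example, an llprio of 6 becomes 0xc0, which leaves the low 5 bits of the tos byte clear.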


Revision tags: OPENBSD_6_4_BASE
# 1.195 12-Sep-2018 krw

Fix obvious cut&pasto in comment (ifa_msghdr -> if_announcemsghdr).

ok claudio@


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.

ok krw@ tb@


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interface's parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interface's function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@
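
The serialisation guarantee described above ("only one instance ... running at any point in time") can be sketched in userland with C11 atomics. This is an illustration under stated assumptions, not the kernel code: demo_serializer, demo_serialized_run and demo_start are hypothetical names, and the counter scheme is only one way to get the effect.

```c
#include <stdatomic.h>

struct demo_serializer {
	atomic_uint	pending;	/* outstanding run requests */
};

static void
demo_serialized_run(struct demo_serializer *s, void (*fn)(void *), void *arg)
{
	/* Somebody is already running fn; leave the work to them. */
	if (atomic_fetch_add(&s->pending, 1) != 0)
		return;

	/* We are the single runner: repeat until every request is served. */
	do {
		fn(arg);
	} while (atomic_fetch_sub(&s->pending, 1) != 1);
}

/* Bookkeeping for a tiny demonstration start routine. */
static struct demo_serializer demo_ser;
static int demo_depth, demo_max_depth, demo_calls;

static void
demo_start(void *arg)
{
	if (++demo_depth > demo_max_depth)
		demo_max_depth = demo_depth;
	/* The first invocation requests another run while still running. */
	if (demo_calls++ == 0)
		demo_serialized_run(&demo_ser, demo_start, arg);
	demo_depth--;
}
```

The nested request made from inside demo_start() returns immediately and is served by the outer loop afterwards, so the routine never has to lock against itself.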


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@
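
The get/put pairing within a single function can be sketched with a toy interface table. Everything here (demo_ifnet, demo_if_get, demo_if_put, demo_if_namelen) is a hypothetical userland model; the real if_get() and if_put() operate on struct ifnet inside the kernel.

```c
#include <stddef.h>
#include <string.h>

struct demo_ifnet {
	const char	*name;
	unsigned int	 index;
	unsigned int	 refcnt;
};

static struct demo_ifnet demo_ifs[] = {
	{ "lo0", 1, 0 },
	{ "em0", 2, 0 },
};

/* Look up an interface by index and take a reference on it. */
static struct demo_ifnet *
demo_if_get(unsigned int index)
{
	size_t i;

	for (i = 0; i < sizeof(demo_ifs) / sizeof(demo_ifs[0]); i++) {
		if (demo_ifs[i].index == index) {
			demo_ifs[i].refcnt++;
			return (&demo_ifs[i]);
		}
	}
	return (NULL);
}

/* Drop a reference taken by demo_if_get(). */
static void
demo_if_put(struct demo_ifnet *ifp)
{
	if (ifp != NULL)
		ifp->refcnt--;
}

/* The reference is taken and released in the same function. */
static int
demo_if_namelen(unsigned int index)
{
	struct demo_ifnet *ifp;
	int len;

	ifp = demo_if_get(index);
	if (ifp == NULL)
		return (-1);
	len = (int)strlen(ifp->name);
	demo_if_put(ifp);
	return (len);
}
```

Because the reference never outlives demo_if_namelen(), a thread waiting to detach the interface is only delayed for the duration of the call.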


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input() a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who is currently down with food poisoning.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@
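
The wrap-safe comparison idiom mentioned above can be sketched as follows. The function name demo_tick_expired is hypothetical, and the subtraction is done in unsigned arithmetic here so that the userland example stays well defined; the conversion back to int relies on the usual two's complement behaviour, which is implementation-defined in C.

```c
#include <limits.h>

/*
 * Returns nonzero when "deadline" is at or before "now".
 * The difference is computed with unsigned wraparound and then read as
 * a signed distance, so the answer stays right when the tick counter
 * wraps from INT_MAX to INT_MIN.
 */
static int
demo_tick_expired(int deadline, int now)
{
	return ((int)((unsigned int)deadline - (unsigned int)now) <= 0);
}
```

A direct "deadline <= now" comparison would report a deadline of INT_MIN + 5 as long past when now is INT_MAX, although it is only 6 ticks in the future.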


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With inputs from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther
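
The alignment trick described above can be sketched with two toy structures. The names (demo_sockaddr, demo_plainreq, demo_aliasreq) and the choice of an int as the aligning member are assumptions for the illustration and do not reproduce the declarations in if.h.

```c
/* Hypothetical generic address made only of bytes: alignment 1. */
struct demo_sockaddr {
	unsigned char	sa_len;
	unsigned char	sa_family;
	char		sa_data[14];
};

/* Without help, the whole request only needs byte alignment. */
struct demo_plainreq {
	char			name[16];
	struct demo_sockaddr	addr;
};

/* The union member "align" raises the alignment of the whole struct. */
struct demo_aliasreq {
	char	name[16];
	union {
		struct demo_sockaddr	addr;
		int			align;	/* never read, only aligns */
	} u;
};
```

With the union in place the compiler has to put a demo_aliasreq on an int boundary on the stack, which matters when the address is later accessed through a pointer to a family-specific type with wider members.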


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
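
The changed meaning of the check can be sketched as a macro over an ordered set of states. The enum below is hypothetical and simplified; the actual LINK_STATE_* values and the LINK_STATE_IS_UP() definition are in if.h.

```c
/* Hypothetical, simplified link states ordered from "worse" to "better". */
enum demo_link_state {
	DEMO_LINK_UNKNOWN = 0,
	DEMO_LINK_DOWN,
	DEMO_LINK_UP,
	DEMO_LINK_HALF_DUPLEX,
	DEMO_LINK_FULL_DUPLEX
};

/* An unknown state counts as up, as does anything from UP onwards. */
#define DEMO_LINK_IS_UP(s) \
	((s) >= DEMO_LINK_UP || (s) == DEMO_LINK_UNKNOWN)
```

Treating "unknown" as up means that drivers which never report link state are not mistaken for interfaces without link.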


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.
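
The "punished by an 8th" step can be sketched as simple arithmetic on a ring's high watermark. The function demo_punish_hwm, the minimum cut of one slot and the clamp to a low watermark are assumptions for the example, not the mclgeti code.

```c
/*
 * Shrink a ring's high watermark by one eighth, never going below
 * the low watermark. A cut of at least one slot is assumed so that
 * small rings still shrink.
 */
static unsigned int
demo_punish_hwm(unsigned int hwm, unsigned int lwm)
{
	unsigned int cut = hwm / 8;

	if (cut == 0)
		cut = 1;
	hwm = (hwm > cut) ? hwm - cut : 0;
	return (hwm < lwm ? lwm : hwm);
}
```

Starting from 256 slots, one punishment leaves 224, where halving would have dropped the ring to 128 in a single step.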


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now makes it possible to clean up the MPLS input and output paths and fix mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladdr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalive packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows binding interfaces to an
alternate routing table and separating them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx rings size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce an if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows specifying less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@
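
The arithmetic described above can be sketched as follows. This is only an
illustration, not the kernel code: the helper name route_priority() is
hypothetical, RTP_NONE standing for "no explicit priority" and the value 8
for RTP_STATIC are assumptions, and lower values are assumed to be preferred.

```c
#include <stdint.h>

#define RTP_NONE	0	/* assumed: caller gave no explicit priority */
#define RTP_STATIC	8	/* assumed base priority for static routes */

/*
 * Hypothetical helper: pick the priority for a new route.
 * An explicit priority wins; otherwise the interface priority
 * is added to the static base priority.
 */
static uint8_t
route_priority(uint8_t requested, uint8_t if_priority)
{
	if (requested != RTP_NONE)
		return requested;
	return (uint8_t)(RTP_STATIC + if_priority);
}
```

With these assumptions, a route over an interface with if_priority 4 gets
priority 12 and so loses to the same route over an interface with
if_priority 0 (priority 8) for as long as the latter is usable.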


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interfaces start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now its queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows to specify a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
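
A minimal sketch of why the do ... while form matters. Q_ENQUEUE and the
structures below are hypothetical stand-ins, not the real IF_ENQUEUE(); the
point is only that a multi-statement macro wrapped this way behaves like a
single C statement, so it can sit in an unbraced if/else.

```c
#include <stddef.h>

struct pkt {
	struct pkt *next;
};

struct queue {
	struct pkt *head;
	struct pkt *tail;
	int	    len;
};

/* Hypothetical macro: append p to q; expands to exactly one statement. */
#define Q_ENQUEUE(q, p) do {				\
	(p)->next = NULL;				\
	if ((q)->tail == NULL)				\
		(q)->head = (p);			\
	else						\
		(q)->tail->next = (p);			\
	(q)->tail = (p);				\
	(q)->len++;					\
} while (0)

/*
 * The unbraced if/else compiles only because the macro is a single
 * statement that still needs its trailing semicolon.
 */
static int
enqueue_if_room(struct queue *q, struct pkt *p, int maxlen)
{
	if (q->len < maxlen)
		Q_ENQUEUE(q, p);
	else
		return 0;
	return 1;
}
```

Had the macro been a bare { ... } block, the semicolon after Q_ENQUEUE(q, p)
would end the if statement and leave the else without a matching if.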


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add an if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces who are member of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@
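
A rough userland sketch of the idea, under stated assumptions: the kernel
change uses a timer to clear the flag, while this illustration simply compares
millisecond timestamps passed in by the caller. All names here (struct inq,
inq_enqueue, inq_is_congested, congested_until) are hypothetical; only the
10ms hold time comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define CONGESTION_HOLD_MS	10	/* hold time from the commit message */

struct inq {
	int	 len;			/* packets currently queued */
	int	 maxlen;		/* queue limit */
	bool	 congested;		/* the congestion indicator */
	uint64_t congested_until;	/* hypothetical: when to clear it */
};

/* Try to queue one packet; a full queue raises the indicator. */
static bool
inq_enqueue(struct inq *q, uint64_t now_ms)
{
	if (q->len >= q->maxlen) {
		q->congested = true;
		q->congested_until = now_ms + CONGESTION_HOLD_MS;
		return false;		/* packet dropped */
	}
	q->len++;
	return true;
}

/* Other subsystems poll this; the flag expires after the hold time. */
static bool
inq_is_congested(struct inq *q, uint64_t now_ms)
{
	if (q->congested && now_ms >= q->congested_until)
		q->congested = false;
	return q->congested;
}
```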


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok
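
The check described above might look roughly like this. It is a sketch only:
if_exists() and struct ifnet_stub are hypothetical, and the table and limit
are passed as parameters here instead of being the kernel globals
ifindex2ifnet and if_indexlim.

```c
#include <stddef.h>

struct ifnet_stub {		/* stand-in for struct ifnet */
	int unused;
};

/*
 * An index is valid only if it is inside the table and its slot is
 * populated; a destroyed interface leaves a NULL slot behind.
 */
static int
if_exists(struct ifnet_stub **table, unsigned int limit, unsigned int idx)
{
	return idx < limit && table[idx] != NULL;
}
```

Comparing against the highest index handed out so far would not be enough
once interfaces can be destroyed, because holes can appear below it.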


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow to register callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange get updated on every packet transmission/receipt.
now: if_lastchange get updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we need two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.208 11-Mar-2021 florian

When RFC 8981 obsoleted RFC 4941 the terminology changed from
"privacy extensions" to "temporary address extensions"

Change ifconfig(8) to output temporary after temporary addresses and
add "temporary" option which is an alias for autoconfprivacy for now.

Also make AUTOCONF6TEMP a positive flag that is set by default.
Previously the negative flag "INET6_NOPRIVACY" was set when privacy
addresses were disabled. This makes the flags output less ugly and
will allow us to disable autoconf addresses while having temporary
addresses enabled in the future.

More work is needed in slaacd.

input benno, jmc, deraadt
previous version OK benno
OK jmc, kn


# 1.207 20-Feb-2021 dlg

add a MONITOR flag to ifaces to say they're only used for watching packets.

an example use of this is when you have a span port on a switch and
you want to be able to see the packets coming out of it with tcpdump,
but do not want these packets to enter the network stack for
processing. this is particularly important if the span port is
pushing a copy of any packets related to the machine doing the
monitoring as it will confuse pf states and the stack.

ok benno@


# 1.206 01-Feb-2021 mvs

ifunit() was fully replaced by if_unit(9) and should go away.

ok bluhm@ dlg@


# 1.205 18-Jan-2021 mvs

Introduce new function if_unit(9). This function returns a pointer to the
interface descriptor corresponding to the unique name. This descriptor
is guaranteed to be valid until if_put(9) is called on the returned
pointer. if_unit(9) should replace the existing ifunit(), which
returns a descriptor that is not safe to dereference after a context switch.
This allows us to avoid some use-after-free issues in the ioctl(2) path.
It also unifies interface descriptor usage.

ok claudio@ sashan@


# 1.204 30-Sep-2020 mvs

We have no if_attachtail() function so remove the declaration.

ok deraadt@ claudio@


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.203 25-Jul-2019 krw

AF_INET comes before AF_INET6. Shorten line to <80 chars.

pointed out by claudio@


# 1.202 25-Jul-2019 krw

Add IFXF_AUTOCONF4 to if_xflags to match IFXF_AUTOCONF6. Let
ifconfig set/unset it.

ok deraadt@ kmos@


# 1.201 19-Apr-2019 dlg

add IF_HDRPRIO_OUTER for rxprio

IF_HDRPRIO_OUTER says you want the priority from the outer encap header.

ok claudio@


Revision tags: OPENBSD_6_5_BASE
# 1.200 10-Apr-2019 dlg

add struct if_sffpage so userland can read a page of sfp/qsfp info

this will be used by ifconfig so it can show you things like who
made what module you're using and when it was made, and on some
modules you get diag info like temperature, vcc, and rx and tx
powers.

im putting the kernel side in so we can keep fiddling with the
userland printf side of things.

this work was done based on a question by rachel roch
ok deraadt@
enthusiasm from many including mikeb@ sthen@ patrick@


# 1.199 23-Jan-2019 dlg

add a SIOCGPWE3 ioctl for interfaces to advertise pwe3 capability

im going to turn mpw into an ethernet interface, which includes
changing its if_type to IFT_ETHER. currently ldpd looks for if_type
IFT_MPLSTUNNEL to decide if an interface is a pseudowire, ie, it's
going to break. the ioctl will let ldpd ask the interface if it is
pseudowire capable as an alternative.

ok claudio@


# 1.198 12-Nov-2018 dlg

add ifreq bits for the tx header prio field ioctls

a tx header prio can set to a fixed value from 0 to 7, or magic
values to represent populating the prio field from the encapsulated
packet, or from the mbuf prio value.

ok claudio@


# 1.197 12-Nov-2018 krw

Add new routing socket message RTM_80211INFO to provide details of
802.11 interface state changes (e.g. SSID) to interested parties.

Original diff from phessler@. Many suggestions and tweaks from
claudio@, stsp@, anton@.

ok claudio@ stsp@ anton@ phessler@


# 1.196 11-Nov-2018 dlg

use the llprio on gre(4) and eoip(4) interfaces for the keepalive tos

llprios are valued 0 to 7, while the ip tos/dscp/tclass is an 8 bit
value. fortunately the high 3 bits map nicely to the llprio values,
so we shift the llprio into place when generating the keepalive
frames. the llprio is defaulted to the value that cisco uses for
their gre keepalives.
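
The shift described above can be sketched as follows. The function name
prio_to_tos() is hypothetical and this is not the kernel's own code; it only
shows how a 0 to 7 priority lands in the top three bits of the 8 bit tos
field.

```c
#include <stdint.h>

/*
 * Place a 3 bit link-level priority in the high 3 bits of the
 * 8 bit IP tos/tclass field. The mask guards against out-of-range
 * input in this sketch.
 */
static uint8_t
prio_to_tos(uint8_t prio)
{
	return (uint8_t)((prio & 0x7) << 5);
}
```

For example, priority 6 becomes 0xc0 and priority 7 becomes 0xe0, with the
low five bits left at zero.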


Revision tags: OPENBSD_6_4_BASE
# 1.195 12-Sep-2018 krw

Fix obvious cut&pasto in comment (ifa_msghdr -> if_announcemsghdr).

ok claudio@


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.

ok krw@ tb@
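
A minimal sketch of the tightened check. The names check_llprio(), PRIO_MIN
and PRIO_MAX are hypothetical, and returning EINVAL for a bad value is an
assumption; the commit message only states the accepted range.

```c
#include <errno.h>

#define PRIO_MIN	0	/* lowest priority the queueing code handles */
#define PRIO_MAX	7	/* highest priority the queueing code handles */

/* Return 0 if prio is usable, or an errno value if it is not. */
static int
check_llprio(int prio)
{
	if (prio < PRIO_MIN || prio > PRIO_MAX)
		return EINVAL;	/* assumed error code */
	return 0;
}
```

A value such as 200 passed the old "less than UCHAR_MAX" test but is
rejected here.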


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.
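
The alignment arithmetic behind this can be sketched as below, assuming the
usual 14 byte Ethernet header and a 2 byte ETHER_ALIGN pad. The helper names
ip_offset() and is_aligned4() are hypothetical.

```c
#include <stddef.h>

#define ETHER_HDR_LEN	14	/* dst + src + ethertype */
#define ETHER_ALIGN	2	/* pad that makes the IP header 4 byte aligned */

/* Offset of the IP header from the (aligned) start of the cluster. */
static size_t
ip_offset(size_t pad)
{
	return pad + ETHER_HDR_LEN;
}

/* True if the offset is a multiple of 4. */
static int
is_aligned4(size_t off)
{
	return (off % 4) == 0;
}
```

With no pad the IP header starts at offset 14, which is not a multiple of 4.
With the 2 byte pad it starts at offset 16, which is. The chip still wants a
full 2k buffer after the pad, hence a cluster of 2048 + 2 bytes.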


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interfaces parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interfaces function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and a SIOCDVNETID
ioctl is added so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didn't
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@
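
A minimal userland sketch of the get/put pairing follows. All names (`demo_ifnet`, `demo_table`, `demo_if_get`, `demo_if_put`, `demo_if_can_detach`) are hypothetical, the counter is a plain integer rather than an atomic one, and tolerating a NULL argument in the put function is an assumption of this sketch rather than something stated in the commit.

```c
#include <stddef.h>

#define DEMO_NIFS 4

/* Hypothetical stand-in for struct ifnet with a reference count. */
struct demo_ifnet {
	const char *name;
	unsigned int refcnt;
};

/* Index-to-interface table; empty slots are NULL. */
static struct demo_ifnet *demo_table[DEMO_NIFS];

/* Look up an interface by index and take a reference on it. */
static struct demo_ifnet *
demo_if_get(unsigned int idx)
{
	struct demo_ifnet *ifp;

	if (idx >= DEMO_NIFS)
		return NULL;
	ifp = demo_table[idx];
	if (ifp != NULL)
		ifp->refcnt++;	/* a kernel would do this atomically */
	return ifp;
}

/* Release a reference taken by demo_if_get(); NULL is ignored. */
static void
demo_if_put(struct demo_ifnet *ifp)
{
	if (ifp == NULL)
		return;
	ifp->refcnt--;
}

/* A detaching thread may only proceed once nobody holds a reference. */
static int
demo_if_can_detach(const struct demo_ifnet *ifp)
{
	return ifp->refcnt == 0;
}
```

Keeping each get and its matching put inside one function keeps references short-lived, so a detach is not blocked for long.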


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

it's only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input(), a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-drivers in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who is currently down with food poisoning.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@
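
The wrap-tolerant comparison can be sketched in portable C as below. The helper names `demo_tick_delta` and `demo_interval_elapsed` are hypothetical. The subtraction is done on unsigned values so that wrap-around is well defined, and the conversion back to `int` assumes the usual two's-complement behaviour of mainstream compilers.

```c
#include <limits.h>

/*
 * Signed distance from `then` to `now` on a wrapping tick counter.
 * Subtracting as unsigned avoids signed-overflow undefined behaviour;
 * converting the result back to int assumes two's complement.
 */
static int
demo_tick_delta(int now, int then)
{
	return (int)((unsigned int)now - (unsigned int)then);
}

/* Has at least `interval` ticks passed since `then`? */
static int
demo_interval_elapsed(int now, int then, int interval)
{
	return demo_tick_delta(now, then) >= interval;
}
```

With `then = INT_MAX - 4` and `now = INT_MIN + 5` (the counter wrapped in between), `demo_tick_delta()` yields 10, whereas a direct `now > then` comparison would wrongly report that no time has passed.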


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With input from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther
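
The alignment trick can be shown with a small sketch. The types below are hypothetical stand-ins, not the real `struct sockaddr` or `struct ifaliasreq`; they assume an address structure built only from byte-sized members, which by itself has an alignment requirement of 1 on common ABIs.

```c
/* Hypothetical sockaddr-like structure made only of byte-sized fields. */
struct demo_sockaddr {
	unsigned char len;
	unsigned char family;
	char data[14];
};

/* Plain embedding: the request inherits the weak (byte) alignment. */
struct demo_req_plain {
	char name[16];
	struct demo_sockaddr addr;
};

/*
 * Union embedding: the int member is never used for data, it only
 * forces the compiler to align the address (and the request) like an int.
 */
struct demo_req_union {
	char name[16];
	union {
		struct demo_sockaddr addr;
		int align;
	} u;
};
```

On typical platforms `_Alignof(struct demo_req_plain)` is 1 while `_Alignof(struct demo_req_union)` equals `_Alignof(int)`, so a stack copy of the union variant can safely be accessed through pointers to more strictly aligned address types.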


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf
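
A sketch of how such a mask simplifies capability inheritance is shown below. The `DEMO_` constants and their bit values are invented for illustration and do not match the real IFCAP_* definitions; `demo_vlan_inherit_caps` is likewise a hypothetical helper.

```c
/* Hypothetical capability bits; the real values live in <net/if.h>. */
#define DEMO_IFCAP_CSUM_IPv4	0x00000001
#define DEMO_IFCAP_CSUM_TCPv4	0x00000002
#define DEMO_IFCAP_CSUM_UDPv4	0x00000004
#define DEMO_IFCAP_VLAN_MTU	0x00000010
#define DEMO_IFCAP_CSUM_TCPv6	0x00000080
#define DEMO_IFCAP_CSUM_UDPv6	0x00000100

/* One mask instead of listing every checksum bit at each use site. */
#define DEMO_IFCAP_CSUM_MASK	(DEMO_IFCAP_CSUM_IPv4 | \
				 DEMO_IFCAP_CSUM_TCPv4 | \
				 DEMO_IFCAP_CSUM_UDPv4 | \
				 DEMO_IFCAP_CSUM_TCPv6 | \
				 DEMO_IFCAP_CSUM_UDPv6)

/* A child interface inherits only the parent's checksum offload bits. */
static unsigned int
demo_vlan_inherit_caps(unsigned int parent_caps)
{
	return parent_caps & DEMO_IFCAP_CSUM_MASK;
}
```

Because the IPv6 bits are part of the mask, they are inherited automatically; a hand-maintained list is where such bits tend to get left out.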


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
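
The changed predicate can be sketched as follows. The enum names and numeric values are assumptions made for this illustration (only the ordering matters: the "up" states compare greater than or equal to the plain up state, and unknown is a separate low value); consult <net/if.h> for the real definitions.

```c
/* Hypothetical link states; numeric values are assumed for the sketch. */
enum demo_link_state {
	DEMO_LINK_STATE_UNKNOWN = 0,	/* driver cannot tell */
	DEMO_LINK_STATE_DOWN = 2,
	DEMO_LINK_STATE_UP = 4,
	DEMO_LINK_STATE_FULL_DUPLEX = 6
};

/* Unknown counts as up: a driver that cannot report link is still usable. */
#define DEMO_LINK_STATE_IS_UP(s) \
	((s) >= DEMO_LINK_STATE_UP || (s) == DEMO_LINK_STATE_UNKNOWN)
```

Callers then need a single test instead of checking for "up or unknown" at every site.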


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@
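
For reference, typical userland use of the POSIX interface looks like the sketch below. `demo_list_interfaces` is a hypothetical helper; the library calls (`if_nameindex`, `if_nametoindex`, `if_freenameindex`) are the standard ones declared in <net/if.h>.

```c
#include <net/if.h>
#include <stdio.h>

/*
 * Print every interface index and name, and check that each name maps
 * back to the same index. Returns the number of mismatches, or -1 if
 * the list could not be obtained.
 */
static int
demo_list_interfaces(void)
{
	struct if_nameindex *list, *p;
	int mismatches = 0;

	list = if_nameindex();
	if (list == NULL)
		return -1;

	/* The array ends with an entry whose index is 0 and name is NULL. */
	for (p = list; p->if_index != 0 && p->if_name != NULL; p++) {
		printf("%u\t%s\n", p->if_index, p->if_name);
		if (if_nametoindex(p->if_name) != p->if_index)
			mismatches++;
	}

	/* The list is allocated by the library and must be released. */
	if_freenameindex(list);
	return mismatches;
}
```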


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows us to clean up the MPLS input and output paths and fix mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladdr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but it never got committed so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalive packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows binding interfaces to
an alternate routing table and separating them from other interfaces in
distinct routing tables. The same network can now be used in any domain at
the same time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx ring's size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce an if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows specifying less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isn't rapidly consuming rx mbufs, we don't allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interface's start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now it's queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces'
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows specifying a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
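
The reason for the do ... while form is easiest to see with a small sketch. `demo_ifqueue`, `DEMO_IF_ENQUEUE` and `demo_enqueue` are hypothetical and much simpler than the real queue macros; the point is only that a multi-statement macro wrapped this way behaves like a single C statement.

```c
/* Hypothetical fixed-size queue, far simpler than struct ifqueue. */
struct demo_ifqueue {
	int len;
	int maxlen;
	int slots[4];
};

/*
 * Wrapped in do { } while (0), the two statements form one statement
 * that takes a trailing semicolon, so the macro is safe inside an
 * unbraced if/else. With bare braces the `else` below would not compile.
 */
#define DEMO_IF_ENQUEUE(q, v) do {			\
	(q)->slots[(q)->len] = (v);			\
	(q)->len++;					\
} while (/* CONSTCOND */ 0)

/* Append v if there is room; return 0 on success, -1 if the queue is full. */
static int
demo_enqueue(struct demo_ifqueue *q, int v)
{
	if (q->len < q->maxlen)
		DEMO_IF_ENQUEUE(q, v);
	else
		return -1;
	return 0;
}
```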


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on a similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add an if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variable definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow to register callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange get updated on every packet transmission/receipt.
now: if_lastchange get updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we need two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.
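
As a minimal sketch of why the wrapper matters, the macro below expands to a
single statement, so it nests safely under if/else and takes a trailing
semicolon. The names struct obj, obj_free(), OBJ_RELEASE() and release_if()
are hypothetical and only illustrate the idiom; this is not the actual
IFAFREE() definition.

```c
struct obj {
	int refcnt;
	int freed;
};

/* Hypothetical stand-in for a release routine; not the kernel's ifafree(). */
static void
obj_free(struct obj *o)
{
	o->freed = 1;
}

/*
 * Wrapped in do { } while (0): the macro body is one statement, so the
 * caller's "else" below cannot attach to the "if" inside the macro.
 */
#define OBJ_RELEASE(o) do {				\
	if ((o)->refcnt <= 0)				\
		obj_free(o);				\
	else						\
		(o)->refcnt--;				\
} while (0)

/* Returns 1 if the macro ran, 0 if the caller's else branch was taken. */
static int
release_if(struct obj *o, int cond)
{
	if (cond)
		OBJ_RELEASE(o);
	else
		return 0;
	return 1;
}
```

Without the wrapper, the semicolon after the macro call would end the inner
if/else, and the caller's else would have no if to attach to.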


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.207 20-Feb-2021 dlg

add a MONITOR flag to ifaces to say they're only used for watching packets.

an example use of this is when you have a span port on a switch and
you want to be able to see the packets coming out of it with tcpdump,
but do not want these packets to enter the network stack for
processing. this is particularly important if the span port is
pushing a copy of any packets related to the machine doing the
monitoring as it will confuse pf states and the stack.

ok benno@


# 1.206 01-Feb-2021 mvs

ifunit() was fully replaced by if_unit(9) and should go away.

ok bluhm@ dlg@


# 1.205 18-Jan-2021 mvs

Introduce new function if_unit(9). This function returns a pointer to the
interface descriptor corresponding to the unique name. This descriptor
is guaranteed to be valid until if_put(9) is called on the returned
pointer. if_unit(9) should replace already existent ifunit() which
returns descriptor not safe for dereference when context was switched.
This allows us to avoid some use-after-free issues in ioctl(2) path.
Also this unifies interface descriptor usage.

ok claudio@ sashan@


# 1.204 30-Sep-2020 mvs

We have no if_attachtail() function so remove the declaration.

ok deraadt@ claudio@


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.203 25-Jul-2019 krw

AF_INET comes before AF_INET6. Shorten line to <80 chars.

pointed out by claudio@


# 1.202 25-Jul-2019 krw

Add IFXF_AUTOCONF4 to if_xflags to match IFXF_AUTOCONF6. Let
ifconfig set/unset it.

ok deraadt@ kmos@


# 1.201 19-Apr-2019 dlg

add IF_HDRPRIO_OUTER for rxprio

IF_HDRPRIO_OUTER says you want the priority from the outer encap header.

ok claudio@


Revision tags: OPENBSD_6_5_BASE
# 1.200 10-Apr-2019 dlg

add struct if_sffpage so userland can read a page of sfp/qsfp info

this will be used by ifconfig so it can show you things like who
made what module you're using and when it was made, and on some
modules you get diag info like temperature, vcc, and rx and tx
powers.

im putting the kernel side in so we can keep fiddling with the
userland printf side of things.

this work was done based on a question by rachel roch
ok deraadt@
enthusiasm from many including mikeb@ sthen@ patrick@


# 1.199 23-Jan-2019 dlg

add a SIOCGPWE3 ioctl for interfaces to advertise pwe3 capability

im going to turn mpw into an ethernet interface, which includes
changing its if_type to IFT_ETHER. currently ldpd looks for if_type
IFT_MPLSTUNNEL to decide if an interface is a pseudowire, ie, it's
going to break. the ioctl will let ldpd ask the interface if it is
pseudowire capable as an alternative.

ok claudio@


# 1.198 12-Nov-2018 dlg

add ifreq bits for the tx header prio field ioctls

a tx header prio can be set to a fixed value from 0 to 7, or magic
values to represent populating the prio field from the encapsulated
packet, or from the mbuf prio value.

ok claudio@


# 1.197 12-Nov-2018 krw

Add new routing socket message RTM_80211INFO to provide details of
802.11 interface state changes (e.g. SSID) to interested parties.

Original diff from phessler@. Many suggestions and tweaks from
claudio@, stsp@, anton@.

ok claudio@ stsp@ anton@ phessler@


# 1.196 11-Nov-2018 dlg

use the llprio on gre(4) and eoip(4) interfaces for the keepalive tos

llprios are valued 0 to 7, while the ip tos/dscp/tclass is an 8 bit
value. fortunately the high 3 bits map nicely to the llprio values,
so we shift the llprio into place when generating the keepalive
frames. the llprio is defaulted to the value that cisco uses for
their gre keepalives.
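
The shift described above can be sketched as follows. prio_to_tos() is a
hypothetical helper name used only for illustration, not the function in the
tree; it assumes the priority occupies the top three bits of the 8 bit field.

```c
#include <stdint.h>

/*
 * Illustrative helper: place a 3 bit priority (0 to 7) in the high
 * three bits of the 8 bit tos/traffic class byte.
 */
static uint8_t
prio_to_tos(unsigned int prio)
{
	return (uint8_t)((prio & 0x7) << 5);
}
```

With this mapping, priority 0 gives a tos of 0x00 and priority 7 gives 0xe0.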


Revision tags: OPENBSD_6_4_BASE
# 1.195 12-Sep-2018 krw

Fix obvious cut&pasto in comment (ifa_msghdr -> if_announcemsghdr).

ok claudio@


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.

ok krw@ tb@
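
A minimal sketch of the tightened check, assuming the valid range is 0 to 7
inclusive as stated above. PRIO_MIN, PRIO_MAX and check_llprio() are
hypothetical names, not the identifiers used in the kernel.

```c
#include <errno.h>

#define PRIO_MIN	0	/* assumed name, for illustration only */
#define PRIO_MAX	7	/* assumed name, for illustration only */

/* Returns 0 if prio is acceptable, EINVAL otherwise. */
static int
check_llprio(int prio)
{
	if (prio < PRIO_MIN || prio > PRIO_MAX)
		return EINVAL;
	return 0;
}
```

A value such as 200 passes a "less than UCHAR_MAX" test but is rejected here.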


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interface's parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interface's function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@
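
The get/put pairing can be sketched in userland as below. This is a
simplified, single-threaded illustration: iface_table, iface_get(),
iface_put() and iface_busy() are hypothetical names, and real kernel code
would need atomic counters and a detach path that waits for references to
drain.

```c
#include <stddef.h>

#define NIFACES	4	/* arbitrary table size for the sketch */

struct iface {
	unsigned int	refcnt;
	int		attached;
};

static struct iface iface_table[NIFACES];

/* Look up an interface by index and take a reference on it. */
static struct iface *
iface_get(unsigned int idx)
{
	if (idx >= NIFACES || !iface_table[idx].attached)
		return NULL;
	iface_table[idx].refcnt++;
	return &iface_table[idx];
}

/* Release a reference taken by iface_get(); NULL is tolerated. */
static void
iface_put(struct iface *ifp)
{
	if (ifp != NULL)
		ifp->refcnt--;
}

/* A detach routine would have to wait while this returns nonzero. */
static int
iface_busy(unsigned int idx)
{
	return iface_table[idx].refcnt != 0;
}
```

Every successful iface_get() is matched by exactly one iface_put(), ideally
in the same function, so the count returns to zero before a detach proceeds.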


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@
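
A rough sketch of a tick-stamped congestion marker follows. The names
congestion_mark, mark_congested() and is_congested(), and the explicit
window argument, are assumptions made for illustration; the committed code
may differ in detail.

```c
/* Hypothetical global: tick value at which congestion was last detected. */
static int congestion_mark;

/* Record that the system is congested as of tick now_ticks. */
static void
mark_congested(int now_ticks)
{
	congestion_mark = now_ticks;
}

/*
 * The system counts as congested while the current tick is within
 * window ticks of the marker. The subtraction is done unsigned so it
 * is well defined when the tick counter wraps; the conversion back to
 * int assumes the usual two's complement behaviour.
 */
static int
is_congested(int now_ticks, int window)
{
	int age = (int)((unsigned int)now_ticks -
	    (unsigned int)congestion_mark);

	return age >= 0 && age <= window;
}
```

No timer is needed to clear the flag: as ticks advances past the window the
comparison simply stops matching.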


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input() a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who is currently lebensmittelvergiftet.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With inputs from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks for the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
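
The rule described above can be sketched as a macro. This is only an illustration: the enum names and numeric values are hypothetical and are not taken from if.h; the one property borrowed from the commit text is that "unknown" is treated as an up state.

```c
#include <assert.h>

/* Hypothetical link states; the real constants and values may differ. */
enum link_state {
	LS_UNKNOWN = 0,	/* driver cannot report link state */
	LS_DOWN    = 1,	/* link is known to be down */
	LS_UP      = 2	/* link is known to be up */
};

/* "Unknown" counts as up, so only an explicit down state reads as down. */
#define LS_IS_UP(s)	((s) >= LS_UP || (s) == LS_UNKNOWN)

/* Usage: callers no longer need a separate check for the unknown case. */
void
link_state_examples(void)
{
	assert(LS_IS_UP(LS_UP));
	assert(LS_IS_UP(LS_UNKNOWN));
	assert(!LS_IS_UP(LS_DOWN));
}
```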


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows us to clean up the MPLS input and output paths and fix mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladdr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago, so there is no need for them.
Diff made by tedu@ some time ago but never got committed, so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalives packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx rings size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce an if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows to specify less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@
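
A minimal sketch of the priority calculation described above. The base value, the "unset" marker, and the function name are hypothetical stand-ins rather than the kernel's own definitions, and the sketch assumes that a numerically lower priority is preferred.

```c
#include <assert.h>

#define BASE_STATIC_PRIO	8	/* hypothetical stand-in for RTP_STATIC */
#define PRIO_UNSET		0	/* hypothetical "no explicit priority" marker */

/*
 * Pick the priority for a new route: an explicit request wins,
 * otherwise the interface priority is added to the static base.
 */
int
route_prio(int requested, int if_priority)
{
	assert(if_priority >= 0);
	if (requested != PRIO_UNSET)
		return requested;
	return BASE_STATIC_PRIO + if_priority;
}
```

With these assumed numbers, a route on an interface with priority 0 gets 8 and a route on an interface with priority 4 gets 12, so the second is only used once the first goes away.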


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interfaces start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now its queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows to specify a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
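
The reason for the do ... while form can be sketched with a toy macro. The struct and macro names below are hypothetical and much simpler than the real IF_ENQUEUE(); the point is only that a multi-statement macro wrapped this way expands to a single statement and so nests safely in if/else.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queue: just two counters. */
struct counter {
	int len;
	int drops;
};

/* Two statements wrapped so the macro behaves like one C statement. */
#define CNT_ENQUEUE(c) do {	\
	(c)->len++;		\
	(c)->drops = 0;		\
} while (0)

int
enqueue_if(struct counter *c, int cond)
{
	assert(c != NULL);
	if (cond)
		CNT_ENQUEUE(c);	/* the trailing ';' completes the statement */
	else
		c->drops++;	/* the else still pairs with the if above */
	return c->len;
}
```

Had the macro been a bare `{ ... }` block, the `;` after the macro call would end the if statement and the following `else` would fail to compile.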


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add a if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow to register callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange get updated on every packet transmission/receipt.
now: if_lastchange get updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we needed two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.206 01-Feb-2021 mvs

ifunit() was fully replaced by if_unit(9) and should go away.

ok bluhm@ dlg@


# 1.205 18-Jan-2021 mvs

Introduce new function if_unit(9). This function returns a pointer to the
interface descriptor corresponding to the unique name. This descriptor
is guaranteed to be valid until if_put(9) is called on the returned
pointer. if_unit(9) should replace already existent ifunit() which
returns descriptor not safe for dereference when context was switched.
This allows us to avoid some use-after-free issues in ioctl(2) path.
Also this unifies interface descriptor usage.

ok claudio@ sashan@


# 1.204 30-Sep-2020 mvs

We have no if_attachtail() function so remove the declaration.

ok deraadt@ claudio@


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.203 25-Jul-2019 krw

AF_INET comes before AF_INET6. Shorten line to <80 chars.

pointed out by claudio@


# 1.202 25-Jul-2019 krw

Add IFXF_AUTOCONF4 to if_xflags to match IFXF_AUTOCONF6. Let
ifconfig set/unset it.

ok deraadt@ kmos@


# 1.201 19-Apr-2019 dlg

add IF_HDRPRIO_OUTER for rxprio

IF_HDRPRIO_OUTER says you want the priority from the outer encap header.

ok claudio@


Revision tags: OPENBSD_6_5_BASE
# 1.200 10-Apr-2019 dlg

add struct if_sffpage so userland can read a page of sfp/qsfp info

this will be used by ifconfig so it can show you things like who
made the module you're using and when it was made, and on some
modules you get diag info like temperature, vcc, and rx and tx
powers.

im putting the kernel side in so we can keep fiddling with the
userland printf side of things.

this work was done based on a question by rachel roch
ok deraadt@
enthusiasm from many including mikeb@ sthen@ patrick@


# 1.199 23-Jan-2019 dlg

add a SIOCGPWE3 ioctl for interfaces to advertise pwe3 capability

im going to turn mpw into an ethernet interface, which includes
changing its if_type to IFT_ETHER. currently ldpd looks for if_type
IFT_MPLSTUNNEL to decide if an interface is a pseudowire, ie, it's
going to break. the ioctl will let ldpd ask the interface if it is
pseudowire capable as an alternative.

ok claudio@


# 1.198 12-Nov-2018 dlg

add ifreq bits for the tx header prio field ioctls

a tx header prio can be set to a fixed value from 0 to 7, or magic
values to represent populating the prio field from the encapsulated
packet, or from the mbuf prio value.

ok claudio@


# 1.197 12-Nov-2018 krw

Add new routing socket message RTM_80211INFO to provide details of
802.11 interface state changes (e.g. SSID) to interested parties.

Original diff from phessler@. Many suggestions and tweaks from
claudio@, stsp@, anton@.

ok claudio@ stsp@ anton@ phessler@


# 1.196 11-Nov-2018 dlg

use the llprio on gre(4) and eoip(4) interfaces for the keepalive tos

llprios are valued 0 to 7, while the ip tos/dscp/tclass is an 8 bit
value. fortunately the high 3 bits map nicely to the llprio values,
so we shift the llprio into place when generating the keepalive
frames. the llprio is defaulted to the value that cisco uses for
their gre keepalives.
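
The shift described above can be sketched as follows. The function and macro names are hypothetical; the only facts taken from the commit text are that the priority is 0 to 7 and that it lands in the high 3 bits of an 8 bit field.

```c
#include <assert.h>
#include <stdint.h>

#define PRIO_MAX	7	/* priorities are 0..7, i.e. 3 bits wide */

/* Place a 3-bit priority in the top 3 bits of an 8-bit TOS value. */
uint8_t
prio_to_tos(uint8_t prio)
{
	assert(prio <= PRIO_MAX);
	return (uint8_t)(prio << 5);	/* 8 bits total - 3 bits of prio = shift by 5 */
}
```

For example, a priority of 7 becomes 0xe0 and a priority of 0 stays 0x00; the low 5 bits are always left clear.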


Revision tags: OPENBSD_6_4_BASE
# 1.195 12-Sep-2018 krw

Fix obvious cut&pasto in comment (ifa_msghdr -> if_announcemsghdr).

ok claudio@


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.

ok krw@ tb@
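
A sketch of the tightened range check, under the assumption that an out-of-range value is rejected with EINVAL (the commit text does not say which error is returned). The function and macro names are hypothetical.

```c
#include <assert.h>
#include <errno.h>

#define PRIO_MIN	0
#define PRIO_MAX	7	/* upper bound the rest of the kernel expects */

/* Return 0 if prio is usable, or an errno value if it is out of range. */
int
check_llprio(int prio)
{
	if (prio < PRIO_MIN || prio > PRIO_MAX)
		return EINVAL;	/* assumed error code */
	return 0;
}

/* Usage: 8 would have passed a "less than UCHAR_MAX" test but fails here. */
void
check_llprio_examples(void)
{
	assert(check_llprio(7) == 0);
	assert(check_llprio(8) == EINVAL);
}
```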


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.
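
The alignment arithmetic behind this change can be sketched as follows.
This is an illustration only, not code from the commit: ETHER_HDR_LEN (14)
and ETHER_ALIGN (2) are the usual values, and the helper functions are
hypothetical names introduced for the example.

```c
#include <stddef.h>

#define ETHER_HDR_LEN	14	/* bytes in an Ethernet header */
#define ETHER_ALIGN	2	/* pad so the payload lands on a 4-byte boundary */

/* Offset of the IP header when the Ethernet header starts at `start`. */
static size_t
ip_hdr_offset(size_t start)
{
	return start + ETHER_HDR_LEN;
}

/* Non-zero if `off` is a multiple of 4, the alignment an IP header wants. */
static int
is_aligned4(size_t off)
{
	return (off % 4) == 0;
}

/* Bytes left for the chip after skipping ETHER_ALIGN at the front. */
static size_t
usable_after_align(size_t cluster_size)
{
	return cluster_size - ETHER_ALIGN;
}
```

With the header at offset 0 of a 2k-aligned cluster the IP header sits at
offset 14, which is not 4-byte aligned. Skipping 2 bytes puts it at 16, but
then a plain 2048-byte cluster only leaves 2046 bytes for a chip that wants
a full 2k buffer, hence the 2k + 2 pool.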


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interface's parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interfaces function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@
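
A minimal userland sketch of the get/put pairing described above, assuming
a plain counter. The struct and function names are hypothetical stand-ins,
not the kernel's if_get()/if_put() implementation, and a real kernel
version would need atomic operations.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct ifnet; only the refcount matters here. */
struct iface {
	unsigned int	refcnt;
};

/* Take a reference; the caller must pair this with iface_put(). */
static struct iface *
iface_get(struct iface *ifp)
{
	if (ifp != NULL)
		ifp->refcnt++;
	return ifp;
}

/* Drop a reference taken by iface_get(); NULL is tolerated. */
static void
iface_put(struct iface *ifp)
{
	if (ifp != NULL)
		ifp->refcnt--;
}

/* A detach may only proceed once nobody else holds a reference. */
static int
iface_can_detach(const struct iface *ifp)
{
	return ifp->refcnt == 0;
}
```

The point of the pairing is that a detach check like iface_can_detach()
only becomes meaningful once every get in the stack has a matching put.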


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@
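
The tick-stamped congestion marker can be sketched roughly like this. It
is an assumption-laden illustration: the variable, the function names and
the `window` parameter are hypothetical, and the actual threshold used by
the kernel is not stated in this log.

```c
/* Hypothetical global marker: the tick value at which we saw congestion. */
static int congestion_mark;

/* Record that the system looked congested at tick `now`. */
static void
mark_congested(int now)
{
	congestion_mark = now;
}

/*
 * Still congested while `now` is within `window` ticks of the mark.
 * The subtraction is done in unsigned arithmetic so a wrapping tick
 * counter does not cause signed overflow; converting the result back
 * to int assumes a two's complement machine.
 */
static int
is_congested(int now, int window)
{
	int diff = (int)((unsigned int)now - (unsigned int)congestion_mark);

	return diff >= 0 && diff <= window;
}
```

No timer is needed to clear the state: as ticks advance past the window,
the comparison simply starts failing.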


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input() a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who currently has food poisoning.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@
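
A small sketch of why comparing a difference copes with wraparound better
than comparing absolute tick values. The helper name is hypothetical and
this is a variation on the idiom, not the code from kern_timeout.c. The
conversion back to int assumes a two's complement machine.

```c
#include <limits.h>

/*
 * Hypothetical helper: have at least `interval` ticks passed between
 * `last` and `now`? The subtraction happens in unsigned arithmetic,
 * which wraps instead of overflowing, and the result is then read as
 * a signed relative interval.
 */
static int
ticks_elapsed(int now, int last, int interval)
{
	return (int)((unsigned int)now - (unsigned int)last) >= interval;
}
```

For example, with `last` just below INT_MAX and `now` just past the wrap,
the difference still comes out as a small positive number, whereas a test
written as `now >= last + interval` would overflow.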


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With inputs from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
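
The changed predicate can be illustrated with a reduced set of states.
The enum and macro below are hypothetical stand-ins with made-up names and
values, chosen only to show the shape of the test, not the definitions in
if.h.

```c
/* Hypothetical, reduced set of link states; values are illustrative. */
enum link_state_example {
	LS_UNKNOWN = 0,		/* driver cannot tell */
	LS_DOWN,		/* link is known to be down */
	LS_UP			/* link is known to be up */
};

/* "Up" now means known-up or unknown, so only a known-down link fails. */
#define LS_IS_UP(s)	((s) >= LS_UP || (s) == LS_UNKNOWN)
```

Callers that previously had to special-case the unknown state next to the
up check can then rely on the single predicate.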


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows to cleanup the MPLS input and output paths and fix mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladdr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalive packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx rings size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce a if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows to specify less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@
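
A rough sketch of the arithmetic, under two assumptions that are not
stated in this log: that the static base priority is 8, and that a
numerically lower route priority is preferred. All names below are
hypothetical.

```c
/* Assumed base priority for static routes; the value is illustrative. */
#define EXAMPLE_RTP_STATIC	8

/* Priority given to a route added without an explicit priority. */
static int
default_route_priority(int if_priority)
{
	return EXAMPLE_RTP_STATIC + if_priority;
}

/* Return 0 if route a wins, 1 if route b wins (lower value preferred). */
static int
pick_route(int prio_a, int prio_b)
{
	return prio_a <= prio_b ? 0 : 1;
}
```

Under these assumptions an interface with if_priority 0 yields priority 8
and one with if_priority 4 yields 12, so the second only carries traffic
when the first route is unavailable.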


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interfaces start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now its queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows to specify a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
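
Why the do ... while wrapper matters can be shown with a toy queue. The
struct, macro and function here are hypothetical and much simpler than
IF_ENQUEUE(); they only demonstrate the idiom.

```c
/* Hypothetical toy queue: remembers the last value and a length. */
struct toyq {
	int	len;
	int	tail;
};

/*
 * Multi-statement macro wrapped in do { } while (0) so that it behaves
 * as exactly one statement and takes the caller's trailing semicolon.
 */
#define TOYQ_PUSH(q, v) do {		\
	(q)->tail = (v);		\
	(q)->len++;			\
} while (0)

/* The macro sits safely in an unbraced if/else. */
static int
push_if(struct toyq *q, int cond, int v)
{
	if (cond)
		TOYQ_PUSH(q, v);
	else
		return 0;
	return 1;
}
```

With a bare `{ ... }` block instead, the semicolon after the macro call
would end the if statement and leave the `else` dangling, which is a
compile error.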


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add an if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed by deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok
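
The existence check described above can be sketched in userland C. This is
only an illustration under stated assumptions, not the kernel code: the names
ex_ifnet and ex_if_exists and the table layout are hypothetical. The point is
that an index is valid only if it is below the table limit and its slot is
not NULL, since a destroyed interface leaves a hole.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct ifnet; only a name is kept. */
struct ex_ifnet {
	const char *name;
};

/*
 * Return nonzero if index idx refers to a live interface.
 * limit plays the role of if_indexlim (size of the table), and
 * table plays the role of ifindex2ifnet[].
 */
static int
ex_if_exists(struct ex_ifnet * const *table, unsigned int limit,
    unsigned int idx)
{
	/* Bounds check first, then make sure the slot is populated. */
	return (idx < limit && table[idx] != NULL);
}
```

A caller would test ex_if_exists(table, limit, idx) before dereferencing
table[idx].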


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow registering callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange gets updated on every packet transmission/receipt.
now: if_lastchange gets updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we needed two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.
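
Why the "do {} while (0)" wrapper matters can be shown with a small sketch.
The macro below is a hypothetical stand-in, not the real IFAFREE(): struct
ex_ref and EX_RELEASE are names invented for the example. Without the
wrapper, a multi-statement macro followed by ";" would break an if/else
chain; with it, the macro behaves as a single statement.

```c
/* Hypothetical reference-counted object. */
struct ex_ref {
	int refcnt;
	int freed;
};

/*
 * Illustrative release macro. The do { } while (0) wrapper turns the
 * body into exactly one statement, so "EX_RELEASE(r);" can sit
 * between an if and its else without changing the control flow.
 */
#define EX_RELEASE(r) do {			\
	if ((r)->refcnt <= 1)			\
		(r)->freed = 1;			\
	else					\
		(r)->refcnt--;			\
} while (0)

/* Returns 1 when the else branch was taken, 0 otherwise. */
static int
ex_release_if(struct ex_ref *r, int cond)
{
	int took_else = 0;

	if (cond)
		EX_RELEASE(r);
	else
		took_else = 1;
	return (took_else);
}
```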


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.205 18-Jan-2021 mvs

Introduce new function if_unit(9). This function returns a pointer to the
interface descriptor corresponding to the unique name. This descriptor
is guaranteed to be valid until if_put(9) is called on the returned
pointer. if_unit(9) should replace already existent ifunit() which
returns descriptor not safe for dereference when context was switched.
This allows us to avoid some use-after-free issues in the ioctl(2) path.
Also this unifies interface descriptor usage.

ok claudio@ sashan@


# 1.204 30-Sep-2020 mvs

We have no if_attachtail() function so remove the declaration.

ok deraadt@ claudio@


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.203 25-Jul-2019 krw

AF_INET comes before AF_INET6. Shorten line to <80 chars.

pointed out by claudio@


# 1.202 25-Jul-2019 krw

Add IFXF_AUTOCONF4 to if_xflags to match IFXF_AUTOCONF6. Let
ifconfig set/unset it.

ok deraadt@ kmos@


# 1.201 19-Apr-2019 dlg

add IF_HDRPRIO_OUTER for rxprio

IF_HDRPRIO_OUTER says you want the priority from the outer encap header.

ok claudio@


Revision tags: OPENBSD_6_5_BASE
# 1.200 10-Apr-2019 dlg

add struct if_sffpage so userland can read a page of sfp/qsfp info

this will be used by ifconfig so it can show you things like who
made the module you're using and when it was made, and on some
modules you get diag info like temperature, vcc, and rx and tx
powers.

im putting the kernel side in so we can keep fiddling with the
userland printf side of things.

this work was done based on a question by rachel roch
ok deraadt@
enthusiasm from many including mikeb@ sthen@ patrick@


# 1.199 23-Jan-2019 dlg

add a SIOCGPWE3 ioctl for interfaces to advertise pwe3 capability

im going to turn mpw into an ethernet interface, which includes
changing its if_type to IFT_ETHER. currently ldpd looks for if_type
IFT_MPLSTUNNEL to decide if an interface is a pseudowire, ie, it's
going to break. the ioctl will let ldpd ask the interface if it is
pseudowire capable as an alternative.

ok claudio@


# 1.198 12-Nov-2018 dlg

add ifreq bits for the tx header prio field ioctls

a tx header prio can be set to a fixed value from 0 to 7, or magic
values to represent populating the prio field from the encapsulated
packet, or from the mbuf prio value.
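
One way to picture the setting described above is a small resolver. This is a
sketch under stated assumptions: the EX_HDRPRIO_* names and their numeric
values are hypothetical and are not taken from the header. It only shows the
idea of a fixed value in 0..7 plus two sentinel values.

```c
/* Hypothetical constants; the real header may differ. */
#define EX_HDRPRIO_MIN		0
#define EX_HDRPRIO_MAX		7
#define EX_HDRPRIO_PACKET	(-1)	/* use the mbuf prio value */
#define EX_HDRPRIO_PAYLOAD	(-2)	/* use the encapsulated packet's prio */

/* Return nonzero if setting is a fixed prio or a known sentinel. */
static int
ex_txhprio_valid(int setting)
{
	if (setting == EX_HDRPRIO_PACKET || setting == EX_HDRPRIO_PAYLOAD)
		return (1);
	return (setting >= EX_HDRPRIO_MIN && setting <= EX_HDRPRIO_MAX);
}

/* Pick the prio to write into the outer header for one packet. */
static int
ex_txhprio_resolve(int setting, int mbuf_prio, int payload_prio)
{
	switch (setting) {
	case EX_HDRPRIO_PACKET:
		return (mbuf_prio);
	case EX_HDRPRIO_PAYLOAD:
		return (payload_prio);
	default:
		return (setting);	/* fixed value, assumed validated */
	}
}
```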

ok claudio@


# 1.197 12-Nov-2018 krw

Add new routing socket message RTM_80211INFO to provide details of
802.11 interface state changes (e.g. SSID) to interested parties.

Original diff from phessler@. Many suggestions and tweaks from
claudio@, stsp@, anton@.

ok claudio@ stsp@ anton@ phessler@


# 1.196 11-Nov-2018 dlg

use the llprio on gre(4) and eoip(4) interfaces for the keepalive tos

llprios are valued 0 to 7, while the ip tos/dscp/tclass is an 8 bit
value. fortunately the high 3 bits map nicely to the llprio values,
so we shift the llprio into place when generating the keepalive
frames. the llprio is defaulted to the value that cisco uses for
their gre keepalives.
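
The mapping described above is plain bit arithmetic: a 3-bit priority placed
in the high 3 bits of an 8-bit field means a left shift by 5. The sketch
below illustrates that arithmetic only. The function names are hypothetical
and this is not the gre(4) code.

```c
#include <stdint.h>

/* Place a 0..7 priority into the top 3 bits of an 8-bit tos value. */
static uint8_t
ex_llprio_to_tos(unsigned int llprio)
{
	return ((uint8_t)((llprio & 0x7) << 5));
}

/* Recover the 0..7 priority from the top 3 bits of a tos value. */
static unsigned int
ex_tos_to_llprio(uint8_t tos)
{
	return (tos >> 5);
}
```

For example, priority 6 maps to tos 0xc0 and priority 7 maps to 0xe0.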


Revision tags: OPENBSD_6_4_BASE
# 1.195 12-Sep-2018 krw

Fix obvious cut&pasto in comment (ifa_msghdr -> if_announcemsghdr).

ok claudio@


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.
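
A range check of the kind described above could look like the sketch below.
The names are hypothetical, and returning EINVAL is an assumption about the
error convention, not something stated in the commit.

```c
#include <errno.h>

#define EX_LLPRIO_MIN	0	/* lowest prio the stack handles */
#define EX_LLPRIO_MAX	7	/* highest prio the stack handles */

/* Return 0 if prio is acceptable, EINVAL otherwise. */
static int
ex_check_llprio(int prio)
{
	if (prio < EX_LLPRIO_MIN || prio > EX_LLPRIO_MAX)
		return (EINVAL);
	return (0);
}
```

A value such as 200 passes an "int less than UCHAR_MAX" test but fails this
check.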

ok krw@ tb@


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.
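
The alignment arithmetic behind the 2k + 2 size can be worked through in a
few lines. This is a sketch, assuming a 14 byte ethernet header and a 2 byte
alignment pad. The EX_ constants and function names are hypothetical.

```c
#include <stddef.h>

#define EX_ETHER_HDR_LEN	14	/* dst + src + ethertype */
#define EX_ETHER_ALIGN		2	/* pad placed before the header */

/* Offset of the ip header from the start of the cluster. */
static size_t
ex_ip_offset(size_t pad)
{
	return (pad + EX_ETHER_HDR_LEN);
}

/* Nonzero if the ip header lands on a 4 byte boundary. */
static int
ex_ip_aligned(size_t pad)
{
	return (ex_ip_offset(pad) % 4 == 0);
}

/* Cluster size needed for an rx buffer of rxbuf bytes plus the pad. */
static size_t
ex_cluster_needed(size_t rxbuf, size_t pad)
{
	return (rxbuf + pad);
}
```

With no pad the ip header sits at offset 14, which is not a multiple of 4.
With a 2 byte pad it sits at offset 16, and a 2048 byte rx buffer then needs
2050 bytes of cluster space.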


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interface's parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interface's function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.
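
The get/put pairing described above can be sketched as a single-threaded
userland model. This is not the kernel implementation: struct ex_ifnet, the
detaching flag and the ex_ function names are hypothetical, and a real
version would need atomic operations and a sleep/wakeup on detach.

```c
#include <stddef.h>

/* Hypothetical interface object with a reference count. */
struct ex_ifnet {
	unsigned int refs;
	int detaching;		/* set once detach has started */
};

/* Take a reference, or return NULL if the interface is going away. */
static struct ex_ifnet *
ex_if_get(struct ex_ifnet *ifp)
{
	if (ifp == NULL || ifp->detaching)
		return (NULL);
	ifp->refs++;
	return (ifp);
}

/* Drop a reference previously taken with ex_if_get(). */
static void
ex_if_put(struct ex_ifnet *ifp)
{
	if (ifp != NULL)
		ifp->refs--;
}

/* Detach may complete only once every reference has been released. */
static int
ex_if_can_detach(const struct ex_ifnet *ifp)
{
	return (ifp->refs == 0);
}
```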

ok mpi@ mikeb@ claudio@


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input() a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-drivers in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who is currently lebensmittelvergiftet.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With inputs from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther
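
The alignment trick mentioned above can be sketched roughly like this. It is an illustration only, assuming a C11 compiler: struct demo_addr, struct demo_aliasreq and the align_pad member are hypothetical names, not the actual if.h layout.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical address structure made only of byte-sized members. */
struct demo_addr {
	unsigned char	len;
	unsigned char	family;
	char		data[14];
};

/*
 * Hypothetical request structure. The union member align_pad is never
 * read or written; it exists so the compiler gives the address (and the
 * whole request) at least the alignment of long long.
 */
struct demo_aliasreq {
	char	name[16];
	union {
		struct demo_addr	addr;
		long long		align_pad;
	} u;
};
```

Without the union, struct demo_addr would only require byte alignment, so a stack copy could land on an address unsuitable for casting to a wider address type.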


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
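
A minimal sketch of the check described above, assuming an ordered set of states. The DEMO_ names and numeric values are hypothetical and chosen for illustration; they are not copied from if.h.

```c
#include <assert.h>

/* Hypothetical link states, ordered so that "up" states sort last. */
enum demo_link_state {
	DEMO_LINK_STATE_UNKNOWN = 0,	/* driver cannot tell */
	DEMO_LINK_STATE_DOWN = 1,
	DEMO_LINK_STATE_UP = 2,
	DEMO_LINK_STATE_FULL_DUPLEX = 3
};

/* Treat every state at or above UP, and also UNKNOWN, as usable. */
#define DEMO_LINK_STATE_IS_UP(s) \
	((s) >= DEMO_LINK_STATE_UP || (s) == DEMO_LINK_STATE_UNKNOWN)
```

With this shape, callers no longer need a separate special case for drivers that never report link state.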


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows to cleanup the MPLS input and output paths and fix mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump is 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalives packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx rings size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce a if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows to specify less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interfaces start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now its queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows to specify a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
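
Why the do ... while wrapper matters can be shown with a small sketch. The queue types and the PKTQ_ENQUEUE macro below are hypothetical stand-ins, not the IF_ENQUEUE() definition from if.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet and queue types for illustration. */
struct pkt {
	struct pkt	*next;
};

struct pktq {
	struct pkt	*head;
	struct pkt	*tail;
	int		 len;
};

/*
 * Multi-statement macro wrapped in do { } while (0) so that it expands
 * to exactly one statement and can be followed by a semicolon.
 */
#define PKTQ_ENQUEUE(q, p) do {					\
	(p)->next = NULL;					\
	if ((q)->tail == NULL)					\
		(q)->head = (p);				\
	else							\
		(q)->tail->next = (p);				\
	(q)->tail = (p);					\
	(q)->len++;						\
} while (0)

/* Returns 1 if the packet was queued, 0 if the queue was full. */
static int
enqueue_if_room(struct pktq *q, struct pkt *p, int maxlen)
{
	if (q->len < maxlen)
		PKTQ_ENQUEUE(q, p);	/* safe in an unbraced if/else */
	else
		return (0);
	return (1);
}
```

If the macro were a bare { ... } block, the semicolon after PKTQ_ENQUEUE(q, p) would end the if statement and the following else would fail to compile.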


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add a if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok


# 1.46 08-Dec-2003 markus

add IOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow to register callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange get updated on every packet transmission/receipt.
now: if_lastchange get updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we need two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.204 30-Sep-2020 mvs

We have no if_attachtail() function so remove the declaration.

ok deraadt@ claudio@


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE OPENBSD_6_8_BASE
# 1.203 25-Jul-2019 krw

AF_INET comes before AF_INET6. Shorten line to <80 chars.

pointed out by claudio@


# 1.202 25-Jul-2019 krw

Add IFXF_AUTOCONF4 to if_xflags to match IFXF_AUTOCONF6. Let
ifconfig set/unset it.

ok deraadt@ kmos@


# 1.201 19-Apr-2019 dlg

add IF_HDRPRIO_OUTER for rxprio

IF_HDRPRIO_OUTER says you want the priority from the outer encap header.

ok claudio@


Revision tags: OPENBSD_6_5_BASE
# 1.200 10-Apr-2019 dlg

add struct if_sffpage so userland can read a page of sfp/qsfp info

this will be used by ifconfig so it can show you things like who
made the module you're using and when it was made, and on some
modules you get diag info like temperature, vcc, and rx and tx
powers.

im putting the kernel side in so we can keep fiddling with the
userland printf side of things.

this work was done based on a question by rachel roch
ok deraadt@
enthusiasm from many including mikeb@ sthen@ patrick@


# 1.199 23-Jan-2019 dlg

add a SIOCGPWE3 ioctl for interfaces to advertise pwe3 capability

im going to turn mpw into an ethernet interface, which includes
changing its if_type to IFT_ETHER. currently ldpd looks for if_type
IFT_MPLSTUNNEL to decide if an interface is a pseudowire, ie, it's
going to break. the ioctl will let ldpd ask the interface if it is
pseudowire capable as an alternative.

ok claudio@


# 1.198 12-Nov-2018 dlg

add ifreq bits for the tx header prio field ioctls

a tx header prio can be set to a fixed value from 0 to 7, or magic
values to represent populating the prio field from the encapsulated
packet, or from the mbuf prio value.

ok claudio@


# 1.197 12-Nov-2018 krw

Add new routing socket message RTM_80211INFO to provide details of
802.11 interface state changes (e.g. SSID) to interested parties.

Original diff from phessler@. Many suggestions and tweaks from
claudio@, stsp@, anton@.

ok claudio@ stsp@ anton@ phessler@


# 1.196 11-Nov-2018 dlg

use the llprio on gre(4) and eoip(4) interfaces for the keepalive tos

llprios are valued 0 to 7, while the ip tos/dscp/tclass is an 8 bit
value. fortunately the high 3 bits map nicely to the llprio values,
so we shift the llprio into place when generating the keepalive
frames. the llprio is defaulted to the value that cisco uses for
their gre keepalives.
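
A minimal sketch of the shift described above, assuming the llprio occupies
the top three bits of the 8 bit tos byte and the remaining five bits are left
zero. The function name llprio_to_tos() is hypothetical and is not taken from
the gre(4) or eoip(4) sources.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: place a 0..7 link-level priority into the
 * high 3 bits of an 8 bit tos/tclass value.
 */
uint8_t
llprio_to_tos(int llprio)
{
	/* the commit states llprios are valued 0 to 7 */
	assert(llprio >= 0 && llprio <= 7);

	/* 8 bits total, 3 bits of prio: shift left by 8 - 3 = 5 */
	return (uint8_t)(llprio << 5);
}
```

The low five bits stay clear in this sketch, so a receiver that only looks at
the top three bits sees the llprio unchanged.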


Revision tags: OPENBSD_6_4_BASE
# 1.195 12-Sep-2018 krw

Fix obvious cut&pasto in comment (ifa_msghdr -> if_announcemsghdr).

ok claudio@


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.

ok krw@ tb@
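
The tightened check can be sketched roughly as follows. The names
LLPRIO_MIN_SKETCH, LLPRIO_MAX_SKETCH and llprio_check() are made up for
illustration, and returning EINVAL is an assumption about how the ioctl
reports a bad value.

```c
#include <errno.h>

/* Illustrative bounds; the commit says the kernel expects 0 to 7 inclusive. */
#define LLPRIO_MIN_SKETCH	0
#define LLPRIO_MAX_SKETCH	7

/* Hypothetical validator: 0 on success, EINVAL for out-of-range values. */
int
llprio_check(int prio)
{
	if (prio < LLPRIO_MIN_SKETCH || prio > LLPRIO_MAX_SKETCH)
		return EINVAL;
	return 0;
}
```

A value such as 200 would have passed a "less than UCHAR_MAX" test but is
rejected here.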


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interface's parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interface's function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@
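
A toy model of the get/put pairing, to illustrate the idea only. Everything
here (toy_ifnet, toy_iftable, toy_if_get(), toy_if_put(), toy_if_can_detach())
is hypothetical, it is single threaded, and it uses a plain counter where a
real kernel would need atomic operations and a sleep/wakeup on detach.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct ifnet with just a reference count. */
struct toy_ifnet {
	unsigned int	index;
	unsigned int	refcnt;
};

#define TOY_NIF	4
static struct toy_ifnet *toy_iftable[TOY_NIF];

/* Look up an interface by index and take a reference on it. */
struct toy_ifnet *
toy_if_get(unsigned int idx)
{
	struct toy_ifnet *ifp;

	if (idx >= TOY_NIF)
		return NULL;
	ifp = toy_iftable[idx];
	if (ifp != NULL)
		ifp->refcnt++;
	return ifp;
}

/* Release a reference previously taken with toy_if_get(). */
void
toy_if_put(struct toy_ifnet *ifp)
{
	if (ifp == NULL)
		return;
	ifp->refcnt--;
}

/* A detach may only proceed once nobody holds a reference. */
int
toy_if_can_detach(const struct toy_ifnet *ifp)
{
	return ifp->refcnt == 0;
}
```

Every successful toy_if_get() is meant to be matched by one toy_if_put();
accepting NULL in toy_if_put() keeps the caller's error path simple.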


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input() a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who is currently lebensmittelvergiftet.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@
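
A small sketch of why the relative form copes with wraparound, assuming a
32 bit int and two's complement conversion as on the usual compilers. The
helper names ticks_since() and ticks_naive_later() are invented for this
example; the subtraction is done in unsigned arithmetic so the sketch itself
does not depend on signed overflow.

```c
#include <limits.h>

/*
 * Hypothetical helper: number of ticks between `then` and `now`,
 * correct even if the counter wrapped from INT_MAX to INT_MIN in between.
 */
int
ticks_since(int then, int now)
{
	/* unsigned subtraction wraps modulo 2^N, then reinterpret as signed */
	return (int)((unsigned int)now - (unsigned int)then);
}

/* The naive absolute comparison, which gives the wrong answer after a wrap. */
int
ticks_naive_later(int then, int now)
{
	return now > then;
}
```

With then = INT_MAX - 1 and now = INT_MIN + 1 the counter has advanced by
three ticks; ticks_since() reports 3 while the naive comparison claims that
no time has passed.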


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With inputs from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther
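
The union trick can be illustrated with a deliberately odd layout. The
structures below are hypothetical and are not the real ifaliasreq or sockaddr;
the 3 byte name field is chosen only to make the effect visible.

```c
#include <stddef.h>

/* Hypothetical address made only of byte-sized members (alignment 1). */
struct addr_bytes {
	unsigned char	len;
	unsigned char	family;
	char		data[14];
};

/* Without help, the address may sit at an odd offset. */
struct req_plain {
	char			name[3];
	struct addr_bytes	addr;
};

/* Sharing a union with an int forces at least int alignment. */
struct req_aligned {
	char	name[3];
	union {
		struct addr_bytes	addr;
		int			align;	/* never read, alignment only */
	} u;
};

size_t
plain_addr_offset(void)
{
	return offsetof(struct req_plain, addr);
}

size_t
aligned_addr_offset(void)
{
	return offsetof(struct req_aligned, u);
}
```

In the plain structure the address lands at offset 3, whereas the union
variant is padded so the address starts on an int boundary.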


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
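
A sketch of the changed predicate, using a cut-down hypothetical enum rather
than the real LINK_STATE_* constants from if.h. Only the rule stated in the
commit is modelled: an unknown state counts as up.

```c
/* Illustrative states only; not the values used by if.h. */
enum toy_link_state {
	TOY_LINK_UNKNOWN,
	TOY_LINK_DOWN,
	TOY_LINK_UP
};

/* Unknown is treated as up, so drivers without link detection still pass. */
int
toy_link_state_is_up(enum toy_link_state s)
{
	return s == TOY_LINK_UP || s == TOY_LINK_UNKNOWN;
}
```

Callers can then test one predicate instead of special-casing interfaces that
never report a link state.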


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.
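
To put numbers on the "8th rather than half" change, here is a rough sketch.
It assumes the punishment is applied to a ring's high watermark and is clamped
at a low watermark; the function names and that exact clamping are
assumptions, not the mclgeti code.

```c
/* Hypothetical: shrink the high watermark by one eighth, not below lwm. */
unsigned int
ring_punish_eighth(unsigned int hwm, unsigned int lwm)
{
	unsigned int next = hwm - hwm / 8;

	return next < lwm ? lwm : next;
}

/* Hypothetical: the older, harsher behaviour of halving the watermark. */
unsigned int
ring_punish_half(unsigned int hwm, unsigned int lwm)
{
	unsigned int next = hwm / 2;

	return next < lwm ? lwm : next;
}
```

Starting from 256, one round of halving drops the ring to 128, while the
gentler rule steps through 224 and then 196, which is the kind of slower
scale-down the commit is after.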


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows to cleanup the MPLS input and output paths and fix mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump is 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalives packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx ring's size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce a if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows to specify less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@
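
As a worked example of the addition, the sketch below assumes a base value of
8 for RTP_STATIC and that a numerically larger priority means a less preferred
route. Both are assumptions for illustration, as are the names
RTP_STATIC_SKETCH and default_route_prio().

```c
/* Assumed base priority for static routes; illustrative value only. */
#define RTP_STATIC_SKETCH	8

/* Hypothetical: priority given to a route added without an explicit one. */
int
default_route_prio(int if_priority)
{
	return RTP_STATIC_SKETCH + if_priority;
}
```

An interface with if_priority 0 then yields 8 and one with if_priority 4
yields 12, so under the stated assumption the second route only takes over
when the first goes away.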


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interface's start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now its queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.
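
The "queue, queue, queue, send" behaviour described above can be sketched in
user space as follows. This is only an illustration under stated assumptions:
struct mock_if, mock_enqueue() and mock_softnet() are hypothetical names, not
the kernel's, and the real mechanism is not shown in this log.

```c
#include <assert.h>

/* Hypothetical stand-in for an interface; not the kernel's struct ifnet. */
struct mock_if {
	int sndq_len;		/* packets waiting on the send queue */
	int start_pending;	/* start routine requested but not yet run */
	int start_calls;	/* how often the start routine ran */
	int sent;		/* packets handed to the tx ring */
};

/* Queue one packet and only note that the start routine should run later. */
static void
mock_enqueue(struct mock_if *ifp)
{
	ifp->sndq_len++;
	ifp->start_pending = 1;
}

/* One softnet pass: run the start routine at most once, draining the queue. */
static void
mock_softnet(struct mock_if *ifp)
{
	if (!ifp->start_pending)
		return;
	ifp->start_pending = 0;
	ifp->start_calls++;
	ifp->sent += ifp->sndq_len;
	ifp->sndq_len = 0;
}
```

With three calls to mock_enqueue() followed by one mock_softnet() pass, the
mock start routine runs once and posts all three packets together.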


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows specifying a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
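
The do ... while idiom mentioned above makes a multi-statement macro behave
like a single C statement. A minimal sketch, assuming a toy queue: struct pkt,
struct queue, Q_ENQUEUE() and enqueue_if_room() are hypothetical and are not
the definitions in if.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature queue; not the kernel's struct ifqueue. */
struct pkt {
	struct pkt *next;
};

struct queue {
	struct pkt *head;
	struct pkt *tail;
	int len;
};

/* Several statements wrapped so the macro expands to exactly one statement. */
#define Q_ENQUEUE(q, p) do {			\
	(p)->next = NULL;			\
	if ((q)->tail == NULL)			\
		(q)->head = (p);		\
	else					\
		(q)->tail->next = (p);		\
	(q)->tail = (p);			\
	(q)->len++;				\
} while (/* CONSTCOND */ 0)

/*
 * With a bare { ... } block instead of do/while, the `;` after the macro
 * call would end the if statement and leave the else without a match.
 */
static int
enqueue_if_room(struct queue *q, struct pkt *p, int maxlen)
{
	if (q->len < maxlen)
		Q_ENQUEUE(q, p);
	else
		return (-1);
	return (0);
}
```

The same wrapping is what makes a macro safe for if/else nesting in general.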


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on a similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add an if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow to register callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange get updated on every packet transmission/receipt.
now: if_lastchange get updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we need two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.203 25-Jul-2019 krw

AF_INET comes before AF_INET6. Shorten line to <80 chars.

pointed out by claudio@


# 1.202 25-Jul-2019 krw

Add IFXF_AUTOCONF4 to if_xflags to match IFXF_AUTOCONF6. Let
ifconfig set/unset it.

ok deraadt@ kmos@


# 1.201 19-Apr-2019 dlg

add IF_HDRPRIO_OUTER for rxprio

IF_HDRPRIO_OUTER says you want the priority from the outer encap header.

ok claudio@


Revision tags: OPENBSD_6_5_BASE
# 1.200 10-Apr-2019 dlg

add struct if_sffpage so userland can read a page of sfp/qsfp info

this will be used by ifconfig so it can show you things like who
made what module you're using and when it was made, and on some
modules you get diag info like temperature, vcc, and rx and tx
powers.

im putting the kernel side in so we can keep fiddling with the
userland printf side of things.

this work was done based on a question by rachel roch
ok deraadt@
enthusiasm from many including mikeb@ sthen@ patrick@


# 1.199 23-Jan-2019 dlg

add a SIOCGPWE3 ioctl for interfaces to advertise pwe3 capability

im going to turn mpw into an ethernet interface, which includes
changing its if_type to IFT_ETHER. currently ldpd looks for if_type
IFT_MPLSTUNNEL to decide if an interface is a pseudowire, ie, it's
going to break. the ioctl will let ldpd ask the interface if it is
pseudowire capable as an alternative.

ok claudio@


# 1.198 12-Nov-2018 dlg

add ifreq bits for the tx header prio field ioctls

a tx header prio can be set to a fixed value from 0 to 7, or magic
values to represent populating the prio field from the encapsulated
packet, or from the mbuf prio value.

ok claudio@


# 1.197 12-Nov-2018 krw

Add new routing socket message RTM_80211INFO to provide details of
802.11 interface state changes (e.g. SSID) to interested parties.

Original diff from phessler@. Many suggestions and tweaks from
claudio@, stsp@, anton@.

ok claudio@ stsp@ anton@ phessler@


# 1.196 11-Nov-2018 dlg

use the llprio on gre(4) and eoip(4) interfaces for the keepalive tos

llprios are valued 0 to 7, while the ip tos/dscp/tclass is an 8 bit
value. fortunately the high 3 bits map nicely to the llprio values,
so we shift the llprio into place when generating the keepalive
frames. the llprio is defaulted to the value that cisco uses for
their gre keepalives.
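
The shift described above can be illustrated with a small sketch. The helper
names llprio_to_tos() and tos_to_llprio() and the LLPRIO_MAX constant are
hypothetical; the shift width of 5 is an assumption derived from placing a
3 bit value in the high 3 bits of an 8 bit field.

```c
#include <assert.h>
#include <stdint.h>

#define LLPRIO_MAX	7	/* llprios are valued 0 to 7 */

/* Move a 3 bit priority into the high 3 bits of an 8 bit tos value. */
static uint8_t
llprio_to_tos(uint8_t llprio)
{
	assert(llprio <= LLPRIO_MAX);
	return ((uint8_t)(llprio << 5));
}

/* Recover the priority from the high 3 bits of a tos value. */
static uint8_t
tos_to_llprio(uint8_t tos)
{
	return ((uint8_t)(tos >> 5));
}
```

For example, priority 6 becomes 0xc0 and priority 7 becomes 0xe0, leaving the
low 5 bits of the tos byte clear.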


Revision tags: OPENBSD_6_4_BASE
# 1.195 12-Sep-2018 krw

Fix obvious cut&pasto in comment (ifa_msghdr -> if_announcemsghdr).

ok claudio@


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.

ok krw@ tb@
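
A range check of the kind described above might look like the sketch below.
The names check_llprio(), PRIO_MIN and PRIO_MAX are hypothetical, and
returning EINVAL is an assumption; the log does not say which error the ioctl
reports.

```c
#include <assert.h>
#include <errno.h>

#define PRIO_MIN	0
#define PRIO_MAX	7

/* Accept only priorities in the inclusive range 0 to 7. */
static int
check_llprio(int prio)
{
	if (prio < PRIO_MIN || prio > PRIO_MAX)
		return (EINVAL);
	return (0);
}
```

A value such as 200 is below UCHAR_MAX and would have passed the older check,
but is rejected here.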


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.
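
The alignment arithmetic behind this change can be worked through in a short
sketch. ETHER_HDR_LEN (14) and ETHER_ALIGN (2) are the usual Ethernet
constants; RXBUF_LEN and the two helper functions are hypothetical and exist
only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define ETHER_HDR_LEN	14	/* dst (6) + src (6) + type (2) */
#define ETHER_ALIGN	2	/* pad that puts the ip header on a 4 byte boundary */
#define RXBUF_LEN	2048	/* assumed rx buffer size, a whole number of kilobytes */

/* Offset of the ip header from the cluster start, given a leading pad. */
static size_t
ip_hdr_offset(size_t pad)
{
	return (pad + ETHER_HDR_LEN);
}

/* Bytes left over when `need` bytes are served from a `cluster` byte pool. */
static size_t
cluster_waste(size_t cluster, size_t need)
{
	return (cluster - need);
}
```

Without the pad the ip header sits at offset 14, which is 2 past a 4 byte
boundary; with the pad it sits at offset 16. The padded buffer needs 2050
bytes, so serving it from a 4k cluster leaves 2046 bytes unused, while a
2k + 2 cluster leaves none.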


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interfaces parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interfaces function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@
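
The get/put pairing described above can be sketched with a user-space mock.
Everything here is hypothetical (struct mock_ifnet, mock_table, mock_if_get(),
mock_if_put(), mock_if_exists()); it only mirrors the shape of the pattern and
assumes, for the mock, that putting a NULL pointer is a no-op.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_NIFS	4

/* Hypothetical interface record with a plain, non-atomic reference count. */
struct mock_ifnet {
	unsigned int index;
	int refcnt;
};

static struct mock_ifnet *mock_table[MOCK_NIFS];

/* Look an interface up by index and take a reference on success. */
static struct mock_ifnet *
mock_if_get(unsigned int index)
{
	struct mock_ifnet *ifp;

	if (index >= MOCK_NIFS || (ifp = mock_table[index]) == NULL)
		return (NULL);
	ifp->refcnt++;
	return (ifp);
}

/* Drop a reference taken by mock_if_get(); NULL is ignored in this mock. */
static void
mock_if_put(struct mock_ifnet *ifp)
{
	if (ifp == NULL)
		return;
	assert(ifp->refcnt > 0);
	ifp->refcnt--;
}

/* Take and release the reference in one function, so it is held briefly. */
static int
mock_if_exists(unsigned int index)
{
	struct mock_ifnet *ifp;
	int found;

	ifp = mock_if_get(index);
	found = (ifp != NULL);
	mock_if_put(ifp);
	return (found);
}
```

Keeping both calls in one function means the count is back to its previous
value on every return path, which is also what makes the pairing easy to check
mechanically.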


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.
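
The marker scheme above could look roughly like this. It is a sketch
only: all names are hypothetical stand-ins for kernel state, and the
window of 2 ticks is an assumption, not the value used in the tree.

```c
/* All names here are hypothetical stand-ins for kernel state. */
static int sk_ticks;			/* clock tick counter */
static int sk_congestion = -1000;	/* tick when congestion was last seen */
#define SK_CONGESTION_WINDOW	2	/* assumed: ticks the marker stays valid */

/* Record "we are congested right now". */
static void
sk_mark_congested(void)
{
	sk_congestion = sk_ticks;
}

/* Congested while the tick counter is still close to the marker. */
static int
sk_congested(void)
{
	/* unsigned subtraction so a wrapping counter stays well defined */
	int diff = (int)((unsigned int)sk_ticks - (unsigned int)sk_congestion);

	return diff >= 0 && diff <= SK_CONGESTION_WINDOW;
}
```

No timeout is needed to clear the state: it expires by itself once
the tick counter has moved past the window.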

written at s2k15
ok henning@ beck@ bluhm@ claudio@


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input() a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who is currently lebensmittelvergiftet.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.
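
A small sketch of the relative-interval idiom mentioned above. The
helper name is hypothetical, and the subtraction is done in unsigned
arithmetic here so that the wraparound is well defined in portable C.

```c
#include <limits.h>

/*
 * Nonzero if fewer than `diff` ticks have passed between `then` and
 * `now`. Only the difference is examined, so the result stays right
 * when the counter wraps from INT_MAX to INT_MIN.
 */
static int
ticks_within(int now, int then, int diff)
{
	return (int)((unsigned int)now - (unsigned int)then) < diff;
}
```

Comparing absolute values (for example "now < then + diff") would
give the wrong answer around the wrap point, and mixing u_int with
int changes how that comparison is evaluated.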

ok pirofti@ guenther@


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.
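
Roughly, such an accessor folds the bounds check and the array access
into one place. The sketch below is illustrative only: the struct,
the table size and the _sketch names are assumptions, not the kernel
definitions.

```c
#include <stddef.h>

/* Hypothetical, cut-down interface descriptor. */
struct ifnet_sketch {
	unsigned int	 if_index;
	const char	*if_xname;
};

#define IF_INDEXLIM_SKETCH	8	/* assumed table size */
static struct ifnet_sketch *ifindex2ifnet_sketch[IF_INDEXLIM_SKETCH];

/* Return the descriptor for an index, or NULL if there is none. */
static struct ifnet_sketch *
if_get_sketch(unsigned int index)
{
	if (index >= IF_INDEXLIM_SKETCH)
		return NULL;
	return ifindex2ifnet_sketch[index];
}
```

Callers then only have to test the result against NULL instead of
repeating the range check before every access to the table.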

With input from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
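
A sketch of how a mask like this replaces an explicit list. The bit
values are illustrative and not copied from if.h, and
vlan_inherit_csum() is a hypothetical helper.

```c
/* Bit values below are illustrative, not copied from if.h. */
#define IFCAP_CSUM_IPv4		0x0001
#define IFCAP_CSUM_TCPv4	0x0002
#define IFCAP_CSUM_UDPv4	0x0004
#define IFCAP_VLAN_MTU		0x0010
#define IFCAP_CSUM_TCPv6	0x0080
#define IFCAP_CSUM_UDPv6	0x0100

/* One name for every checksum offload bit. */
#define IFCAP_CSUM_MASK	(IFCAP_CSUM_IPv4 | IFCAP_CSUM_TCPv4 | \
	IFCAP_CSUM_UDPv4 | IFCAP_CSUM_TCPv6 | IFCAP_CSUM_UDPv6)

/* Hypothetical helper: checksum capabilities a vlan takes from its parent. */
static unsigned int
vlan_inherit_csum(unsigned int parent_caps)
{
	return parent_caps & IFCAP_CSUM_MASK;
}
```

Because the IPv6 bits are part of the mask, they are inherited
together with the IPv4 ones without being listed at the call site.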
ok benno mpf


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
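
One possible shape of such a macro, as a sketch. The numeric values
are illustrative; only their ordering (down states below up states,
unknown apart from both) matters here.

```c
/* Illustrative values; only their ordering matters for this sketch. */
#define LINK_STATE_UNKNOWN	0	/* driver cannot tell */
#define LINK_STATE_DOWN		2
#define LINK_STATE_UP		4
#define LINK_STATE_FULL_DUPLEX	6	/* up, duplex known */

/* Up if the state is at or above UP, or if it is simply unknown. */
#define LINK_STATE_IS_UP(_s)	\
	((_s) >= LINK_STATE_UP || (_s) == LINK_STATE_UNKNOWN)
```

Treating "unknown" as up keeps interfaces whose drivers never report
link state usable without a special case in every caller.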
OK mpf@ mcbride@ henning@


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.
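
The grow/punish rules above can be sketched as follows. This is not
the mclgeti code: the struct and function names are hypothetical, the
growth step of +1 is an assumption, and "punished by an 8th" is read
here as "reduced by one eighth of the current size".

```c
/* Hypothetical per-ring accounting state. */
struct rxring_sketch {
	unsigned int	cwm;	/* current watermark (allowed ring fill) */
	unsigned int	lwm;	/* low watermark, never go below */
	unsigned int	hwm;	/* high watermark, never go above */
	int		grown;	/* tick at which the ring last grew */
};

/* Allow growth at most once per softclock tick. */
static void
ring_grow(struct rxring_sketch *r, int now)
{
	if (r->grown == now)
		return;		/* already grew during this tick */
	if (r->cwm < r->hwm)
		r->cwm++;
	r->grown = now;
}

/* Livelock detected: shrink by an eighth instead of by half. */
static void
ring_punish(struct rxring_sketch *r)
{
	unsigned int cut = r->cwm / 8;

	if (cut == 0)
		cut = 1;
	r->cwm = (r->cwm > r->lwm + cut) ? r->cwm - cut : r->lwm;
}
```

The smaller cut and the per-tick growth limit both damp the rapid
up-and-down scaling described in the message.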

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows cleaning up the MPLS input and output paths and fixing mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).
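
The pointer switch described above can be sketched like this. The
struct layout, the flag value and every function name are
hypothetical; packets are plain integers so the call path is easy to
follow.

```c
#define IFXF_MPLS_SKETCH	0x8	/* assumed flag value */

/* Hypothetical, cut-down interface descriptor. */
struct mpls_ifnet {
	int	xflags;
	int	(*if_output)(struct mpls_ifnet *, int);	/* used by the stack */
	int	(*if_ll_output)(struct mpls_ifnet *, int);	/* real L2 output */
};

/* Stand-in for a link-level output routine such as ether_output(). */
static int
ll_output_sketch(struct mpls_ifnet *ifp, int pkt)
{
	(void)ifp;
	return pkt;
}

/* Stand-in for mpls_output(): do MPLS work, then call the L2 routine. */
static int
mpls_output_sketch(struct mpls_ifnet *ifp, int pkt)
{
	return ifp->if_ll_output(ifp, pkt + 1000);	/* "+1000" marks a label push */
}

/* Setting or clearing the flag switches the output pointer. */
static void
set_mpls(struct mpls_ifnet *ifp, int on)
{
	if (on) {
		ifp->xflags |= IFXF_MPLS_SKETCH;
		ifp->if_output = mpls_output_sketch;
	} else {
		ifp->xflags &= ~IFXF_MPLS_SKETCH;
		ifp->if_output = ifp->if_ll_output;
	}
}
```

The link-level code itself needs no MPLS knowledge; only the pointer
the rest of the stack calls through is changed.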


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladdr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed, so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalive packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx rings size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce a if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows specifying less preferred
interfaces that will only take over if the primary interface loses link.
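
A sketch of the priority arithmetic, assuming that a numerically
lower route priority is preferred. The value 8 for RTP_STATIC is
illustrative, and the struct and helper are hypothetical.

```c
#define RTP_STATIC	8	/* illustrative base priority for static routes */

/* Hypothetical, cut-down interface descriptor. */
struct rt_ifnet {
	unsigned char	if_priority;	/* 0 = most preferred interface */
};

/* Priority given to a route that was added without an explicit one. */
static unsigned int
route_default_priority(const struct rt_ifnet *ifp)
{
	return RTP_STATIC + ifp->if_priority;
}
```

With one interface at priority 0 and another at 4, both may carry a
default route; the second one is used only when the route via the
first is no longer usable.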
OK deraadt@


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interfaces start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now its queue, queue, queue,
send.
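
The "queue, queue, queue, send" behaviour can be sketched as below.
All names are hypothetical, and the softnet pass is modelled as an
ordinary function call.

```c
/* Hypothetical send queue with a deferred start flag. */
struct txq_sketch {
	int	len;		/* packets waiting on the send queue */
	int	pending;	/* start routine requested but not yet run */
	int	start_calls;	/* how often the driver start routine ran */
};

/* Driver start routine: post everything on the queue in one go. */
static void
txq_start(struct txq_sketch *q)
{
	q->start_calls++;
	q->len = 0;
}

/* Stack side: queue the packet, only note that a start is wanted. */
static void
txq_enqueue(struct txq_sketch *q)
{
	q->len++;
	q->pending = 1;
}

/* Runs once per softnet pass: call start at most once. */
static void
txq_softnet(struct txq_sketch *q)
{
	if (q->pending) {
		q->pending = 0;
		txq_start(q);
	}
}
```

Three enqueues followed by one softnet pass result in a single start
call that sees all three packets.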

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows to specify a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
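
For illustration, this is why a multi-statement macro is usually
wrapped in do ... while (0). The queue type and macro below are
hypothetical and much simpler than IF_ENQUEUE().

```c
#define SKQ_MAX	4

/* Hypothetical fixed-size queue. */
struct skq {
	int	len;
	int	items[SKQ_MAX];
};

/* Two statements that behave as a single C statement. */
#define SKQ_PUSH(q, v) do {			\
	(q)->items[(q)->len] = (v);		\
	(q)->len++;				\
} while (0)

/* The macro is followed by ';' and an else branch without trouble. */
static int
skq_push_if_room(struct skq *q, int v)
{
	if (q->len < SKQ_MAX)
		SKQ_PUSH(q, v);
	else
		return -1;
	return 0;
}
```

With a bare { ... } body the semicolon after the macro call would end
the if statement, and the following else would not compile.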


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on a similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add a if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow to register callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange get updated on every packet transmission/receipt.
now: if_lastchange get updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyway).

benefit: the following command now works. previously we needed two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.
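
A minimal sketch of why the wrapper matters, assuming a macro whose body is an if/else like the one being fixed here. The names `ex_ref`, `ex_free`, `EX_RELEASE` and `ex_release_if` are hypothetical and are not taken from if.h.

```c
/* Hypothetical stand-in for a reference-counted address object. */
struct ex_ref {
    int refcnt;
    int freed;
};

static void
ex_free(struct ex_ref *r)
{
    r->freed = 1;
}

/*
 * The do { } while (0) wrapper makes the macro expand to exactly one
 * statement, so "EX_RELEASE(r);" can sit between an if and its else.
 * A bare if/else body would capture the caller's else branch, and a
 * plain { } block followed by ";" would orphan it.
 */
#define EX_RELEASE(r) do {              \
    if ((r)->refcnt <= 0)               \
        ex_free(r);                     \
    else                                \
        (r)->refcnt--;                  \
} while (0)

/* Returns 1 when the caller's own else branch ran, 0 otherwise. */
static int
ex_release_if(int cond, struct ex_ref *r)
{
    if (cond)
        EX_RELEASE(r);
    else
        return 1;
    return 0;
}
```

With the wrapper in place, the else in `ex_release_if()` binds to the caller's `if (cond)` rather than to the if hidden inside the macro.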


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.202 25-Jul-2019 krw

Add IFXF_AUTOCONF4 to if_xflags to match IFXF_AUTOCONF6. Let
ifconfig set/unset it.

ok deraadt@ kmos@


# 1.201 19-Apr-2019 dlg

add IF_HDRPRIO_OUTER for rxprio

IF_HDRPRIO_OUTER says you want the priority from the outer encap header.

ok claudio@


Revision tags: OPENBSD_6_5_BASE
# 1.200 10-Apr-2019 dlg

add struct if_sffpage so userland can read a page of sfp/qsfp info

this will be used by ifconfig so it can show you things like who
made the module you're using and when it was made, and on some
modules you get diag info like temperature, vcc, and rx and tx
powers.

im putting the kernel side in so we can keep fiddling with the
userland printf side of things.

this work was done based on a question by rachel roch
ok deraadt@
enthusiasm from many including mikeb@ sthen@ patrick@


# 1.199 23-Jan-2019 dlg

add a SIOCGPWE3 ioctl for interfaces to advertise pwe3 capability

im going to turn mpw into an ethernet interface, which includes
changing its if_type to IFT_ETHER. currently ldpd looks for if_type
IFT_MPLSTUNNEL to decide if an interface is a pseudowire, ie, it's
going to break. the ioctl will let ldpd ask the interface if it is
pseudowire capable as an alternative.

ok claudio@


# 1.198 12-Nov-2018 dlg

add ifreq bits for the tx header prio field ioctls

a tx header prio can be set to a fixed value from 0 to 7, or magic
values to represent populating the prio field from the encapsulated
packet, or from the mbuf prio value.

ok claudio@


# 1.197 12-Nov-2018 krw

Add new routing socket message RTM_80211INFO to provide details of
802.11 interface state changes (e.g. SSID) to interested parties.

Original diff from phessler@. Many suggestions and tweaks from
claudio@, stsp@, anton@.

ok claudio@ stsp@ anton@ phessler@


# 1.196 11-Nov-2018 dlg

use the llprio on gre(4) and eoip(4) interfaces for the keepalive tos

llprios are valued 0 to 7, while the ip tos/dscp/tclass is an 8 bit
value. fortunately the high 3 bits map nicely to the llprio values,
so we shift the llprio into place when generating the keepalive
frames. the llprio is defaulted to the value that cisco uses for
their gre keepalives.
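
The shift described above can be sketched as follows. This is an illustration only: the function name `ex_llprio_to_tos` is hypothetical, and the real keepalive code may structure this differently.

```c
#include <stdint.h>

/*
 * Map a 3-bit link-level priority (0..7) onto the top three bits of
 * the 8-bit IP TOS / traffic class byte. Hypothetical helper, not the
 * kernel's own function.
 */
static uint8_t
ex_llprio_to_tos(unsigned int llprio)
{
    return (uint8_t)((llprio & 0x7) << 5);
}
```

For example, priority 6 lands at 0xc0 and priority 7 at 0xe0, leaving the low five bits of the byte clear.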


Revision tags: OPENBSD_6_4_BASE
# 1.195 12-Sep-2018 krw

Fix obvious cut&pasto in comment (ifa_msghdr -> if_announcemsghdr).

ok claudio@


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.

ok krw@ tb@
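
A small sketch of the tighter range check, assuming the ioctl path rejects bad values with EINVAL. The helper name `ex_check_llprio` is hypothetical.

```c
#include <errno.h>

/*
 * Accept only the priorities the rest of the stack handles (0..7
 * inclusive). Hypothetical helper; the return convention is an
 * assumption for illustration.
 */
static int
ex_check_llprio(int prio)
{
    if (prio < 0 || prio > 7)
        return EINVAL;
    return 0;
}
```

A value such as 200 would have passed a "less than UCHAR_MAX" test but fails this one.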


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.
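
The arithmetic behind the 2k + 2 size can be sketched like this. The `EX_` constants mirror the usual 14-byte Ethernet header and 2-byte alignment pad, but the names and helpers are hypothetical and do not come from the mbuf code.

```c
#include <stddef.h>

#define EX_ETHER_HDR_LEN 14    /* Ethernet header length in bytes */
#define EX_ETHER_ALIGN   2     /* pad that puts the IP header on a 4-byte boundary */
#define EX_MCL2K         2048
#define EX_MCL4K         4096

/* Offset of the IP header when the frame starts `pad` bytes into the cluster. */
static size_t
ex_ip_offset(size_t pad)
{
    return pad + EX_ETHER_HDR_LEN;
}

/* Nonzero when the IP header is 4-byte aligned for the given pad. */
static int
ex_ip_aligned(size_t pad)
{
    return ex_ip_offset(pad) % 4 == 0;
}

/* Bytes left unused when a 2k buffer plus pad is carved from `cluster`. */
static size_t
ex_waste(size_t cluster)
{
    return cluster - (EX_MCL2K + EX_ETHER_ALIGN);
}
```

With no pad the IP header sits at offset 14, which is misaligned; with a 2-byte pad it sits at 16. Fitting 2048 + 2 bytes into a 4k cluster leaves 2046 bytes unused, which is the "nearly 2k" mentioned above.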


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interface's parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interface's function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@
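
A rough sketch of the get/put pairing described above, using a hypothetical `ex_` table and plain (non-atomic) counters. It is not the kernel implementation, which has to cope with concurrency and interface detach.

```c
#include <stddef.h>

/* Hypothetical interface descriptor; only the fields the sketch needs. */
struct ex_ifnet {
    int refcnt;
    unsigned int mtu;
};

#define EX_NIFS 4
static struct ex_ifnet *ex_iftable[EX_NIFS];

/* Look up by index and take a reference; NULL if there is no such interface. */
static struct ex_ifnet *
ex_if_get(unsigned int idx)
{
    struct ex_ifnet *ifp;

    if (idx >= EX_NIFS || (ifp = ex_iftable[idx]) == NULL)
        return NULL;
    ifp->refcnt++;
    return ifp;
}

/* Drop a reference taken by ex_if_get(); NULL is accepted. */
static void
ex_if_put(struct ex_ifnet *ifp)
{
    if (ifp != NULL)
        ifp->refcnt--;
}

/* Get and put in the same function, so the reference is short-lived. */
static unsigned int
ex_if_mtu(unsigned int idx)
{
    struct ex_ifnet *ifp = ex_if_get(idx);
    unsigned int mtu = 0;

    if (ifp != NULL)
        mtu = ifp->mtu;
    ex_if_put(ifp);
    return mtu;
}
```

Keeping both calls in one function makes an unbalanced reference easy to spot, both by eye and by a static analyser.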


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input() a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who currently has food poisoning.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@
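
A sketch of a wrap-tolerant tick comparison in the same spirit, assuming a signed tick counter that eventually wraps. The helper `ex_ring_may_grow` is hypothetical; the subtraction goes through unsigned arithmetic so the sketch itself avoids signed overflow, and it assumes the usual two's-complement conversion back to int.

```c
#include <limits.h>    /* INT_MAX / INT_MIN, used in the usage note below */

/*
 * Allow growth only if at least one tick has passed since the last
 * growth, and record the new time. The signed difference stays small
 * and positive across a wrap of the tick counter.
 */
static int
ex_ring_may_grow(int now, int *grown)
{
    int delta = (int)((unsigned int)now - (unsigned int)*grown);

    if (delta <= 0)
        return 0;
    *grown = now;
    return 1;
}
```

With `*grown == INT_MAX` and `now == INT_MIN` (one tick later, after the counter wraps) the difference is 1, so growth is still allowed. Comparing a u_int against an int directly would instead convert the negative tick value to a huge unsigned number.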


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With inputs from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther
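
The alignment trick can be illustrated with a small sketch. The type names below are hypothetical and the layout is simplified; only the idea of pairing a byte-only struct with a wider member in a union reflects the commit message.

```c
/*
 * Hypothetical address struct made only of bytes, so on its own the
 * compiler may place it at any byte offset.
 */
struct ex_sockaddr {
    unsigned char len;
    unsigned char family;
    char data[14];
};

/*
 * Wrapping it in a union with an int member raises the required
 * alignment of the whole object to that of int, wherever it is placed
 * (including on the stack).
 */
union ex_addr_u {
    struct ex_sockaddr addr;
    int align;
};
```

Here `_Alignof(struct ex_sockaddr)` is 1 while `_Alignof(union ex_addr_u)` equals `_Alignof(int)`, and the size is unchanged because 16 is already a multiple of the int alignment on common platforms.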


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf
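
A sketch of how a mask like this simplifies capability inheritance. The `EX_` flag names and bit values are illustrative assumptions, not the definitions in if.h.

```c
/* Illustrative capability bits; the values are assumptions for this sketch. */
#define EX_CSUM_IPv4   0x0001u
#define EX_CSUM_TCPv4  0x0002u
#define EX_CSUM_UDPv4  0x0004u
#define EX_VLAN_MTU    0x0010u
#define EX_CSUM_TCPv6  0x0080u
#define EX_CSUM_UDPv6  0x0100u

/* One mask covering every checksum capability, IPv6 ones included. */
#define EX_CSUM_MASK   (EX_CSUM_IPv4 | EX_CSUM_TCPv4 | EX_CSUM_UDPv4 | \
                        EX_CSUM_TCPv6 | EX_CSUM_UDPv6)

/* A child interface inherits only the parent's checksum capabilities. */
static unsigned int
ex_inherit_csum(unsigned int parent_caps)
{
    return parent_caps & EX_CSUM_MASK;
}
```

Using the mask instead of a hand-written list means a newly covered flag is inherited without touching the caller.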


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
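
A sketch of a link-state test that treats "unknown" as up. The `EX_` state names and their numeric ordering are assumptions made for illustration and may not match the values in if.h.

```c
/* Illustrative link states; ordering is an assumption for this sketch. */
#define EX_LINK_UNKNOWN      0   /* driver does not report link state */
#define EX_LINK_DOWN         1
#define EX_LINK_UP           2
#define EX_LINK_HALF_DUPLEX  3
#define EX_LINK_FULL_DUPLEX  4

/* Up means any "up or better" state, or a driver that cannot tell. */
#define EX_LINK_IS_UP(s) \
    ((s) >= EX_LINK_UP || (s) == EX_LINK_UNKNOWN)
```

Callers then need a single test instead of special-casing interfaces whose drivers never report link state.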


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.
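
The difference between punishing a ring by an eighth and by half can be sketched as below. The helper `ex_punish_ring` and the clamp to a low watermark are assumptions for illustration; the actual mclgeti bookkeeping is not shown in this log.

```c
/*
 * Shrink a ring's high watermark by one eighth, never going below the
 * low watermark. Hypothetical helper, not the mclgeti code.
 */
static unsigned int
ex_punish_ring(unsigned int hwm, unsigned int lwm)
{
    unsigned int n = hwm - hwm / 8;

    return n < lwm ? lwm : n;
}
```

Starting from 256 slots, one punishment leaves 224, where halving would have left 128, so a ring recovers from a brief livelock much sooner.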


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows us to clean up the MPLS input and output paths and fix mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladdr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed so I am doing it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalive packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx ring's size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce an if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows specifying less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interface's start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now its queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces'
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows to specify a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
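
For illustration only, a minimal sketch of why wrapping a multi-statement
macro in do ... while (0) matters. The TOYQ_* macros and struct toyq are
hypothetical names made up for this example; they are not the IF_ENQUEUE()
definitions from if.h.

```c
/* Hypothetical queue type for illustration; not the real struct ifqueue. */
struct toyq {
	int len;	/* packets currently queued */
	int drops;	/* packets dropped because the queue was full */
};

/*
 * Wrapped in do { } while (0), each macro expands to exactly one
 * statement, so it nests under if/else without extra braces and the
 * trailing semicolon at the call site is harmless.
 */
#define TOYQ_ENQUEUE(q) do {		\
	(q)->len++;			\
} while (0)

#define TOYQ_DROP(q) do {		\
	(q)->drops++;			\
} while (0)

/* Enqueue if there is room, otherwise count a drop. */
static void
toyq_input(struct toyq *q, int maxlen)
{
	if (q->len < maxlen)
		TOYQ_ENQUEUE(q);
	else
		TOYQ_DROP(q);
}
```

Had the macros been bare { ... } blocks, the semicolon after TOYQ_ENQUEUE(q)
would end the if statement and the else would fail to compile.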


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on a similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add a if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow to register callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange get updated on every packet transmission/receipt.
now: if_lastchange get updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we needed two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.201 19-Apr-2019 dlg

add IF_HDRPRIO_OUTER for rxprio

IF_HDRPRIO_OUTER says you want the priority from the outer encap header.

ok claudio@


Revision tags: OPENBSD_6_5_BASE
# 1.200 10-Apr-2019 dlg

add struct if_sffpage so userland can read a page of sfp/qsfp info

this will be used by ifconfig so it can show you things like who
made the module you're using and when it was made, and on some
modules you get diag info like temperature, vcc, and rx and tx
powers.

im putting the kernel side in so we can keep fiddling with the
userland printf side of things.

this work was done based on a question by rachel roch
ok deraadt@
enthusiasm from many including mikeb@ sthen@ patrick@


# 1.199 23-Jan-2019 dlg

add a SIOCGPWE3 ioctl for interfaces to advertise pwe3 capability

im going to turn mpw into an ethernet interface, which includes
changing its if_type to IFT_ETHER. currently ldpd looks for if_type
IFT_MPLSTUNNEL to decide if an interface is a pseudowire, ie, it's
going to break. the ioctl will let ldpd ask the interface if it is
pseudowire capable as an alternative.

ok claudio@


# 1.198 12-Nov-2018 dlg

add ifreq bits for the tx header prio field ioctls

a tx header prio can be set to a fixed value from 0 to 7, or magic
values to represent populating the prio field from the encapsulated
packet, or from the mbuf prio value.

ok claudio@


# 1.197 12-Nov-2018 krw

Add new routing socket message RTM_80211INFO to provide details of
802.11 interface state changes (e.g. SSID) to interested parties.

Original diff from phessler@. Many suggestions and tweaks from
claudio@, stsp@, anton@.

ok claudio@ stsp@ anton@ phessler@


# 1.196 11-Nov-2018 dlg

use the llprio on gre(4) and eoip(4) interfaces for the keepalive tos

llprios are valued 0 to 7, while the ip tos/dscp/tclass is an 8 bit
value. fortunately the high 3 bits map nicely to the llprio values,
so we shift the llprio into place when generating the keepalive
frames. the llprio is defaulted to the value that cisco uses for
their gre keepalives.
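
A minimal sketch of the shift described above, assuming a helper named
llprio_to_tos() (a hypothetical name, not the function used in the kernel).
It places a 0 to 7 priority in the high 3 bits of the 8 bit tos value.

```c
#include <stdint.h>

/*
 * Hypothetical helper: map a link-level priority (0..7) onto the
 * high 3 bits of the 8 bit IP tos field by shifting left by 5.
 * The mask keeps out-of-range input from spilling past 8 bits.
 */
static uint8_t
llprio_to_tos(uint8_t prio)
{
	return (uint8_t)((prio & 0x7) << 5);
}
```

With this mapping prio 0 gives tos 0x00 and prio 7 gives tos 0xe0; the low
5 bits of the tos are always left clear.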


Revision tags: OPENBSD_6_4_BASE
# 1.195 12-Sep-2018 krw

Fix obvious cut&pasto in comment (ifa_msghdr -> if_announcemsghdr).

ok claudio@


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.

ok krw@ tb@
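
As a rough sketch of the range check described above: the function name,
the TOY_PRIO_MAX constant and the choice of EINVAL as the error are all
assumptions for illustration, not the actual ioctl code.

```c
#include <errno.h>

#define TOY_PRIO_MAX	7	/* hypothetical name for the upper bound */

/*
 * Hypothetical validation: accept only priorities the rest of the
 * stack expects, i.e. 0 to 7 inclusive. Returns 0 on success or an
 * errno value (EINVAL is an assumption) when out of range.
 */
static int
toy_check_llprio(int prio)
{
	if (prio < 0 || prio > TOY_PRIO_MAX)
		return (EINVAL);
	return (0);
}
```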


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.
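
The alignment arithmetic behind this can be sketched as follows. The TOY_*
constants and helper names are hypothetical and only model the offsets; the
sketch assumes a 14 byte ethernet header, a 4 byte alignment requirement for
the ip header, and a 2 byte pad in front of the rx buffer.

```c
#include <stddef.h>

#define TOY_ETHER_HDR_LEN	14	/* dst (6) + src (6) + type (2) */
#define TOY_ETHER_ALIGN		2	/* pad placed before the frame */

/* Offset of the ip header when the frame starts `pad` bytes in. */
static size_t
toy_ip_offset(size_t pad)
{
	return (pad + TOY_ETHER_HDR_LEN);
}

/* Nonzero if the ip header lands on a 4 byte boundary. */
static int
toy_ip_aligned(size_t pad)
{
	return ((toy_ip_offset(pad) % 4) == 0);
}

/* Nonzero if a padded rx buffer of `buflen` bytes fits in the cluster. */
static int
toy_fits(size_t clsize, size_t buflen, size_t pad)
{
	return (pad + buflen <= clsize);
}
```

Under these assumptions a frame at offset 0 puts the ip header at byte 14
(misaligned), while a 2 byte pad puts it at byte 16. A 2048 byte rx buffer
plus that pad no longer fits a 2048 byte cluster but does fit 2048 + 2.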


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interface's parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interface's function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@
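
The get/put pairing described above can be sketched in plain C. This is a simplified illustration, not the kernel implementation: struct sketch_ifnet, sketch_iftable, sketch_if_get() and sketch_if_put() are hypothetical names, the counter is not atomic, and accepting NULL in the put function is an assumption made for caller convenience.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct ifnet. */
struct sketch_ifnet {
	unsigned int	if_index;
	unsigned int	if_refcnt;	/* references currently held */
};

#define SKETCH_NIFS	4

/* Hypothetical index-to-interface table; empty slots are NULL. */
static struct sketch_ifnet *sketch_iftable[SKETCH_NIFS];

/* Look up an interface by index and take a reference on it. */
static struct sketch_ifnet *
sketch_if_get(unsigned int idx)
{
	struct sketch_ifnet *ifp;

	if (idx >= SKETCH_NIFS || (ifp = sketch_iftable[idx]) == NULL)
		return (NULL);
	ifp->if_refcnt++;
	return (ifp);
}

/* Drop a reference taken by sketch_if_get(); NULL is a no-op (assumed). */
static void
sketch_if_put(struct sketch_ifnet *ifp)
{
	if (ifp == NULL)
		return;
	ifp->if_refcnt--;
}
```

Every successful sketch_if_get() is meant to be matched by one sketch_if_put(), so a detach path could later wait for the count to reach zero.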


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@
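
A rough sketch of the tick-based marker described above, in plain C. All names (sketch_ticks, sketch_congestion, SKETCH_WINDOW) and the window length are assumptions for illustration; the kernel's actual variables and threshold are not shown in this log.

```c
/* Assumed: number of ticks for which a congestion mark stays valid. */
#define SKETCH_WINDOW	1

/* Hypothetical stand-in for the kernel tick counter. */
static int sketch_ticks;

/* Tick value at which congestion was last seen; starts outside the window. */
static int sketch_congestion = -(SKETCH_WINDOW + 1);

/* Mark the system as congested "now". */
static void
sketch_mark_congested(void)
{
	sketch_congestion = sketch_ticks;
}

/*
 * The system counts as congested while the tick counter is still close
 * to the marked value. The subtraction is done in unsigned arithmetic
 * so that a wrapping counter does not cause signed overflow.
 */
static int
sketch_is_congested(void)
{
	int diff = (int)((unsigned int)sketch_ticks -
	    (unsigned int)sketch_congestion);

	return (diff >= 0 && diff <= SKETCH_WINDOW);
}
```

No timer is needed to clear the state: the comparison simply starts failing once the tick counter has moved far enough from the mark.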


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input() a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who currently has food poisoning.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@
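
The wraparound-safe comparison mentioned above can be illustrated with a small userland helper. This is a sketch, not code from kern_timeout.c: ticks_elapsed_since() is a hypothetical name, and the subtraction goes through unsigned int because signed overflow is undefined in portable C (the conversion back to int is assumed to be the usual two's complement one).

```c
#include <limits.h>

/*
 * Hypothetical helper: returns nonzero if at least `diff` ticks have
 * passed between `then` and `now`, even if the counter wrapped from
 * INT_MAX to INT_MIN in between. Comparing the difference, rather
 * than the two absolute values, is what makes this work.
 */
static int
ticks_elapsed_since(int now, int then, int diff)
{
	int delta = (int)((unsigned int)now - (unsigned int)then);

	return (delta >= diff);
}
```

A direct `now > then` test would give the wrong answer right after the counter wraps, while the difference stays small and positive.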


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With inputs from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks for the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther
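
The alignment trick described above can be sketched as follows. The types are hypothetical stand-ins (sketch_addr, sketch_req_plain, sketch_req_aligned), not the real struct sockaddr or struct ifaliasreq; the point is only that a union with an otherwise unused int member raises the alignment the compiler must honour.

```c
/* Hypothetical address type with only byte-sized members. */
struct sketch_addr {
	unsigned char	sa_len;
	unsigned char	sa_family;
	char		sa_data[14];
};

/* Only byte-sized members: on common ABIs this may sit at any offset. */
struct sketch_req_plain {
	char			name[16];
	struct sketch_addr	addr;
};

/* The int member is never read; it exists to force int alignment. */
struct sketch_req_aligned {
	char	name[16];
	union {
		struct sketch_addr	addr;
		int			align;
	} u;
};
```

With the union in place, an instance on the stack is aligned at least as strictly as an int, so casting the embedded address to a more strictly aligned address type is less likely to fault on strict-alignment machines.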


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
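
The changed predicate can be illustrated with a minimal macro. The enum below is hypothetical (names, values and ordering are assumptions, not the definitions from if.h); it only shows "unknown" being treated as up while "down" is not.

```c
/* Hypothetical link states; names and ordering are assumptions. */
enum sketch_link_state {
	SKETCH_LINK_UNKNOWN,	/* driver cannot report link state */
	SKETCH_LINK_DOWN,
	SKETCH_LINK_UP
};

/* Treat both "up" and "unknown" as usable, as the commit describes. */
#define SKETCH_LINK_IS_UP(s) \
	((s) == SKETCH_LINK_UP || (s) == SKETCH_LINK_UNKNOWN)
```

Callers can then use the single predicate instead of testing for the unknown state separately.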


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows to cleanup the MPLS input and output paths and fix mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladdr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump is 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalive packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx rings size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce a if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows to specify less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interface's start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now its queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows to specify a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
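
Why the do ... while wrapper matters can be shown with a small example. The queue types and the SKETCH_ENQUEUE macro below are hypothetical and much simpler than the real IF_ENQUEUE(); they only demonstrate that a multi-statement macro wrapped this way behaves as a single C statement.

```c
#include <stddef.h>

/* Hypothetical minimal packet and queue types, for illustration only. */
struct sketch_pkt {
	struct sketch_pkt	*next;
};

struct sketch_queue {
	struct sketch_pkt	*head;
	struct sketch_pkt	*tail;
	int			 len;
};

/*
 * Multi-statement macro wrapped in do { ... } while (0), so that
 * "SKETCH_ENQUEUE(q, p);" expands to exactly one statement.
 */
#define SKETCH_ENQUEUE(q, p) do {					\
	(p)->next = NULL;						\
	if ((q)->tail == NULL)						\
		(q)->head = (p);					\
	else								\
		(q)->tail->next = (p);					\
	(q)->tail = (p);						\
	(q)->len++;							\
} while (0)

/*
 * The unbraced if/else below only compiles because of the wrapper;
 * with a bare { ... } block the trailing semicolon would end the if
 * statement and orphan the else.
 */
static int
sketch_enqueue_if_room(struct sketch_queue *q, struct sketch_pkt *p,
    int maxlen)
{
	if (q->len < maxlen)
		SKETCH_ENQUEUE(q, p);
	else
		return (0);
	return (1);
}
```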


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add a if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow registering callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange gets updated on every packet transmission/receipt.
now: if_lastchange gets updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyway).

benefit: the following command now works. previously we needed two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.
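
A minimal sketch of why a multi-statement macro is wrapped this way. The macro below is not the historical IFAFREE() body; IFAFREE_SKETCH, the ifa_freed field, and maybe_release() are hypothetical names used only to show the idiom.

```c
#include <assert.h>

/* Stand-in for struct ifaddr; ifa_freed exists only for this sketch. */
struct ifaddr_sketch {
    int ifa_refcnt;
    int ifa_freed;    /* set when the last reference is dropped */
};

/*
 * Wrapped in do { } while (0), the macro expands to a single statement,
 * so "if (x) IFAFREE_SKETCH(ifa); else ..." parses as intended.
 */
#define IFAFREE_SKETCH(ifa)                 \
do {                                        \
    assert((ifa)->ifa_refcnt > 0);          \
    if (--(ifa)->ifa_refcnt == 0)           \
        (ifa)->ifa_freed = 1;               \
} while (0)

/* Drops a reference only when asked to; returns 1 if it did. */
int
maybe_release(struct ifaddr_sketch *ifa, int release)
{
    if (release)
        IFAFREE_SKETCH(ifa);
    else
        return (0);
    return (1);
}
```

Without the wrapper, the trailing semicolon after the macro call would terminate the if statement early and the else would either fail to compile or bind to the macro's inner if.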


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.199 23-Jan-2019 dlg

add a SIOCGPWE3 ioctl for interfaces to advertise pwe3 capability

im going to turn mpw into an ethernet interface, which includes
changing its if_type to IFT_ETHER. currently ldpd looks for if_type
IFT_MPLSTUNNEL to decide if an interface is a pseudowire, ie, it's
going to break. the ioctl will let ldpd ask the interface if it is
pseudowire capable as an alternative.

ok claudio@


# 1.198 12-Nov-2018 dlg

add ifreq bits for the tx header prio field ioctls

a tx header prio can be set to a fixed value from 0 to 7, or magic
values to represent populating the prio field from the encapsulated
packet, or from the mbuf prio value.

ok claudio@


# 1.197 12-Nov-2018 krw

Add new routing socket message RTM_80211INFO to provide details of
802.11 interface state changes (e.g. SSID) to interested parties.

Original diff from phessler@. Many suggestions and tweaks from
claudio@, stsp@, anton@.

ok claudio@ stsp@ anton@ phessler@


# 1.196 11-Nov-2018 dlg

use the llprio on gre(4) and eoip(4) interfaces for the keepalive tos

llprios are valued 0 to 7, while the ip tos/dscp/tclass is an 8 bit
value. fortunately the high 3 bits map nicely to the llprio values,
so we shift the llprio into place when generating the keepalive
frames. the llprio is defaulted to the value that cisco uses for
their gre keepalives.
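
The shift described above can be sketched as a small helper. The function name llprio_to_tos() is an assumption for illustration and is not taken from the gre(4) source.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: place a 0..7 llprio into the top three bits of
 * the 8-bit IP TOS byte, leaving the low five bits clear.
 */
uint8_t
llprio_to_tos(unsigned int llprio)
{
    assert(llprio <= 7);
    return ((uint8_t)(llprio << 5));
}
```

For example, an llprio of 6 yields a TOS byte of 0xc0 under this mapping.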


Revision tags: OPENBSD_6_4_BASE
# 1.195 12-Sep-2018 krw

Fix obvious cut&pasto in comment (ifa_msghdr -> if_announcemsghdr).

ok claudio@


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.

ok krw@ tb@


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.
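
The alignment arithmetic behind this change can be worked through with a short sketch. The helper names are hypothetical; the 14-byte Ethernet header and the 2-byte ETHER_ALIGN pad are the usual values and are stated here as assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETHER_HDR_LEN   14    /* dst + src + ethertype */
#define ETHER_ALIGN     2     /* pad that makes the payload 4-byte aligned */

/* Distance of the IP header from a 4-byte boundary for a given buffer start. */
unsigned int
ip_hdr_misalignment(uintptr_t buf_start)
{
    return ((unsigned int)((buf_start + ETHER_HDR_LEN) % 4));
}

/* Bytes left unused when `need` bytes are stored in a `cluster`-byte cluster. */
size_t
cluster_waste(size_t cluster, size_t need)
{
    assert(cluster >= need);
    return (cluster - need);
}
```

A buffer that starts on a 2k boundary puts the IP header 2 bytes off a 4-byte boundary, while starting 2 bytes in aligns it. Fitting 2k + 2 bytes into a 4k cluster leaves 2046 bytes unused, which is the waste the new pool avoids.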


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interface's parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interface's function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.
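
One way to get this kind of serialisation is a counter that lets only the first caller run the routine while later callers just record a request. The sketch below uses C11 atomics in userland; it is an illustration of the idea under those assumptions, not the kernel implementation, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

struct serializer {
    atomic_uint pending;        /* outstanding run requests */
    void (*fn)(void *);
    void *arg;
};

void
serial_init(struct serializer *s, void (*fn)(void *), void *arg)
{
    atomic_init(&s->pending, 0);
    s->fn = fn;
    s->arg = arg;
}

void
serial_run(struct serializer *s)
{
    unsigned int seen;

    /* Only the caller that moves pending away from 0 becomes the runner. */
    if (atomic_fetch_add(&s->pending, 1) != 0)
        return;

    do {
        seen = atomic_load(&s->pending);
        assert(seen > 0);
        s->fn(s->arg);
        /* Retire the requests seen before this pass; loop if more arrived. */
    } while (atomic_fetch_sub(&s->pending, seen) != seen);
}

/* Demo routine: tracks nesting depth and re-requests itself once. */
struct demo {
    struct serializer *s;
    int runs;
    int depth;
    int max_depth;
};

void
demo_start(void *arg)
{
    struct demo *d = arg;

    if (++d->depth > d->max_depth)
        d->max_depth = d->depth;
    if (d->runs++ == 0)
        serial_run(d->s);    /* deferred to the next pass, not nested */
    d->depth--;
}
```

Because a request made while the routine is running only bumps the counter, the routine never has to lock against itself; the current runner picks the request up on its next pass.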

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and a SIOCDVNETID
ioctl is added so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.
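
The get/put pairing can be modelled in userland as below. This is a single-threaded sketch with a plain counter; a kernel version would need atomic operations. The _model names, the table layout, and the NULL handling in the put function are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>

struct ifnet_model {
    unsigned int if_refcnt;
};

#define NIFS 2

struct ifnet_model ifs[NIFS];
struct ifnet_model *iftable[NIFS] = { NULL, &ifs[1] };

/* Model of if_get(): look up by index and take a reference. */
struct ifnet_model *
if_get_model(unsigned int idx)
{
    struct ifnet_model *ifp;

    if (idx >= NIFS || (ifp = iftable[idx]) == NULL)
        return (NULL);
    ifp->if_refcnt++;
    return (ifp);
}

/* Model of if_put(): drop a reference taken by if_get_model(). */
void
if_put_model(struct ifnet_model *ifp)
{
    if (ifp == NULL)
        return;
    assert(ifp->if_refcnt > 0);
    ifp->if_refcnt--;
}

/* A detach may only finish once every reference has been released. */
int
can_detach(const struct ifnet_model *ifp)
{
    return (ifp->if_refcnt == 0);
}
```

Every successful get is matched by exactly one put, which is what lets a detaching thread wait for the count to reach zero.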

ok mpi@ mikeb@ claudio@


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input(), a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who is currently lebensmittelvergiftet.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.
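
A sketch of a wrap-tolerant tick comparison in that spirit. The helper names are hypothetical. The difference is computed in unsigned arithmetic so the subtraction cannot overflow, and the conversion back to int assumes the usual two's complement behaviour.

```c
#include <assert.h>
#include <limits.h>

/* Signed distance from b to a, tolerant of the counter wrapping. */
int
ticks_diff(int a, int b)
{
    return ((int)((unsigned int)a - (unsigned int)b));
}

/* True while fewer than `window` ticks have elapsed since `then`. */
int
within_window(int now, int then, int window)
{
    assert(window > 0);
    return (ticks_diff(now, then) < window);
}
```

Comparing the difference against an interval, rather than comparing the two absolute values, is what keeps the test meaningful when the counter wraps between the two samples.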

ok pirofti@ guenther@


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With inputs from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther
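
The alignment trick can be illustrated with a reduced model. The struct and member names below are stand-ins chosen for this sketch, not the definitions in if.h; the point is only that a union member with stricter alignment raises the alignment of the whole request structure.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Byte-only stand-in for struct sockaddr: its own alignment is 1. */
struct sockaddr_sketch {
    unsigned char sa_len;
    unsigned char sa_family;
    char sa_data[14];
};

struct aliasreq_sketch {
    char name[16];
    union {
        struct sockaddr_sketch addr;
        int align;    /* never read; only raises the union's alignment */
    } u;
};

/* The union, and therefore the struct, is now at least int-aligned. */
static_assert(alignof(struct aliasreq_sketch) >= alignof(int),
    "union member should force int alignment");
```

With the extra member in place the compiler must place any instance, including one on a stack frame, on an int boundary.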


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf
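
A sketch of how a single mask replaces a hand-written list of flags. The CAP_ names and bit values are illustrative only; the real definitions live in if.h.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative capability bits, not the values used by the kernel. */
#define CAP_CSUM_IPv4   0x01
#define CAP_CSUM_TCPv4  0x02
#define CAP_CSUM_UDPv4  0x04
#define CAP_VLAN_MTU    0x10
#define CAP_CSUM_TCPv6  0x80
#define CAP_CSUM_UDPv6  0x100

#define CAP_CSUM_MASK   (CAP_CSUM_IPv4 | CAP_CSUM_TCPv4 | \
                         CAP_CSUM_UDPv4 | CAP_CSUM_TCPv6 | CAP_CSUM_UDPv6)

/* Checksum capabilities a child interface would take over from its parent. */
uint32_t
inherit_csum_caps(uint32_t parent_caps)
{
    /* The mask must not leak non-checksum capabilities. */
    assert((CAP_CSUM_MASK & CAP_VLAN_MTU) == 0);
    return (parent_caps & CAP_CSUM_MASK);
}
```

Because the mask names every checksum bit once, a newly covered flag such as an IPv6 one is inherited without touching the caller.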


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
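
The changed predicate can be sketched as follows. The enum is a reduced stand-in with made-up names and values; only the shape of the test, which treats "unknown" as up, reflects the commit message.

```c
#include <assert.h>

/* Reduced, hypothetical set of link states ordered from worst to best. */
enum link_state_sketch {
    LS_UNKNOWN = 0,
    LS_DOWN,
    LS_UP,
    LS_HALF_DUPLEX,
    LS_FULL_DUPLEX
};

/* Up means "up or better", or a state the driver cannot report. */
#define LS_IS_UP(s)    ((s) >= LS_UP || (s) == LS_UNKNOWN)

int
link_usable(int state)
{
    assert(state >= LS_UNKNOWN && state <= LS_FULL_DUPLEX);
    return (LS_IS_UP(state));
}
```

Callers that previously had to special-case the unknown state can then rely on the one predicate.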


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@
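
For reference, a small userland sketch of the POSIX pair this entry refers to: if_nameindex() returns an array that the caller releases with if_freenameindex(). The helper name count_interfaces() is an assumption of this sketch.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <net/if.h>

#include <assert.h>
#include <stddef.h>

/* Count the interfaces listed; returns -1 if the list is unavailable. */
int
count_interfaces(void)
{
    struct if_nameindex *list, *p;
    int n = 0;

    if ((list = if_nameindex()) == NULL)
        return (-1);

    /* The array ends with an entry whose if_index is 0 and if_name is NULL. */
    for (p = list; p->if_index != 0; p++) {
        assert(p->if_name != NULL);
        n++;
    }

    if_freenameindex(list);
    return (n);
}
```

The result depends on the machine it runs on, so no particular count should be expected.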


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows cleaning up the MPLS input and output paths and fixing mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).
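
The pointer-switching trick can be sketched in userland like this. All names are hypothetical stand-ins, and the "packet" is just an int so the effect of the extra layer is visible; the real functions take mbufs and addresses.

```c
#include <assert.h>
#include <stddef.h>

#define XF_MPLS 0x1    /* illustrative flag bit */

struct ifnet_sketch {
    int xflags;
    int (*if_output)(struct ifnet_sketch *, int);
    int (*if_ll_output)(struct ifnet_sketch *, int);    /* saved link-level output */
};

/* Stand-in for a link-level output routine: returns the packet unchanged. */
int
ether_output_sketch(struct ifnet_sketch *ifp, int pkt)
{
    (void)ifp;
    return (pkt);
}

/* Stand-in for the MPLS layer: marks the packet, then calls the saved routine. */
int
mpls_output_sketch(struct ifnet_sketch *ifp, int pkt)
{
    return (ifp->if_ll_output(ifp, pkt + 1000));
}

/* Switch the output pointers when the flag is set or cleared. */
void
set_mpls(struct ifnet_sketch *ifp, int on)
{
    if (on && !(ifp->xflags & XF_MPLS)) {
        ifp->if_ll_output = ifp->if_output;
        ifp->if_output = mpls_output_sketch;
        ifp->xflags |= XF_MPLS;
    } else if (!on && (ifp->xflags & XF_MPLS)) {
        assert(ifp->if_ll_output != NULL);
        ifp->if_output = ifp->if_ll_output;
        ifp->if_ll_output = NULL;
        ifp->xflags &= ~XF_MPLS;
    }
}
```

Callers keep invoking if_output as before; whether the MPLS layer sits in the path is decided once, when the flag changes, rather than on every packet.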


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladdr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago, so there is no need for them.
Diff made by tedu@ some time ago but never committed, so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalive packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows binding interfaces to
alternate routing tables and separating them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx rings size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce an if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows specifying less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@
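The priority arithmetic described in this entry can be sketched as follows. This is only an illustration: the helper name and the TOY_ prefix are hypothetical, and the value 8 for RTP_STATIC is an assumption that should be checked against sys/net/route.h.

```c
#include <assert.h>

/* Assumed value of RTP_STATIC; verify against sys/net/route.h. */
#define TOY_RTP_STATIC	8

/*
 * Effective priority of a route added without an explicit priority:
 * the static base plus the interface's if_priority.
 * Hypothetical helper, not a kernel function.
 */
static int
toy_route_priority(int if_priority)
{
	assert(if_priority >= 0);
	return TOY_RTP_STATIC + if_priority;
}
```

With an if_priority of 0 on a wired interface and 4 on a wireless one (the default introduced in revision 1.104), the sketch yields 8 and 12. Assuming the lower number is preferred, the wireless route would only be used once the wired route goes away.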


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interfaces start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now its queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows to specify a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
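A minimal sketch of why the do ... while form matters, using a toy queue rather than the kernel's struct ifqueue. All TOY_ and toy_ names are hypothetical. Without the wrapper, a two-statement macro used as the body of an if would break the following else.

```c
#include <assert.h>
#include <stddef.h>

/* Toy queue counters; not the kernel's struct ifqueue. */
struct toy_queue {
	int len;	/* packets currently queued */
	int total;	/* packets ever queued */
	int drops;	/* packets rejected because the queue was full */
};

/*
 * Two statements wrapped in do { } while (0): the macro expands to a
 * single statement, so it can be followed by ';' and an else branch.
 */
#define TOY_ENQUEUE(q) do {	\
	(q)->len++;		\
	(q)->total++;		\
} while (0)

#define TOY_DROP(q) do {	\
	(q)->drops++;		\
} while (0)

/* Enqueue if there is room, otherwise count a drop. */
static void
toy_input(struct toy_queue *q, int maxlen)
{
	assert(q != NULL);
	if (q->len < maxlen)
		TOY_ENQUEUE(q);
	else
		TOY_DROP(q);
}
```

If TOY_ENQUEUE were defined as a bare brace block, the semicolon after it would end the if statement and the else would fail to compile.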


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on a similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add a if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces who are member of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow to register callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange get updated on every packet transmission/receipt.
now: if_lastchange get updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we needed two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.198 12-Nov-2018 dlg

add ifreq bits for the tx header prio field ioctls

a tx header prio can be set to a fixed value from 0 to 7, or magic
values to represent populating the prio field from the encapsulated
packet, or from the mbuf prio value.

ok claudio@


# 1.197 12-Nov-2018 krw

Add new routing socket message RTM_80211INFO to provide details of
802.11 interface state changes (e.g. SSID) to interested parties.

Original diff from phessler@. Many suggestions and tweaks from
claudio@, stsp@, anton@.

ok claudio@ stsp@ anton@ phessler@


# 1.196 11-Nov-2018 dlg

use the llprio on gre(4) and eoip(4) interfaces for the keepalive tos

llprios are valued 0 to 7, while the ip tos/dscp/tclass is an 8 bit
value. fortunately the high 3 bits map nicely to the llprio values,
so we shift the llprio into place when generating the keepalive
frames. the llprio is defaulted to the value that cisco uses for
their gre keepalives.
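The shift described above can be sketched like this. The function name and TOY_ constant are hypothetical, and the shift of 5 is inferred from "the high 3 bits" of an 8 bit field rather than copied from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_LLPRIO_MAX	7	/* llprio values run from 0 to 7 */

/*
 * Place a 3 bit llprio into the top 3 bits of an 8 bit
 * tos/dscp/tclass value. Hypothetical helper, not the gre(4) code.
 */
static uint8_t
toy_llprio_to_tos(uint8_t llprio)
{
	assert(llprio <= TOY_LLPRIO_MAX);
	return (uint8_t)(llprio << 5);
}
```

The low 5 bits of the result are always zero in this sketch, so only the precedence part of the field is populated.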


Revision tags: OPENBSD_6_4_BASE
# 1.195 12-Sep-2018 krw

Fix obvious cut&pasto in comment (ifa_msghdr -> if_announcemsghdr).

ok claudio@


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.

ok krw@ tb@
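A small sketch of the tightened range check, assuming the ioctl rejects out-of-range values with EINVAL. The function name, the TOY_ constants and the choice of error code are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>

#define TOY_PRIO_MIN	0
#define TOY_PRIO_MAX	7	/* what the rest of the kernel expects */

/*
 * Return 0 if prio is acceptable, EINVAL otherwise.
 * Hypothetical helper; the real check lives in the ioctl path.
 */
static int
toy_check_llprio(int prio)
{
	if (prio < TOY_PRIO_MIN || prio > TOY_PRIO_MAX)
		return EINVAL;
	return 0;
}
```

Under the old check described in the entry, a value such as 200 would have passed because it is below UCHAR_MAX. Here it is rejected.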


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.
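The alignment arithmetic behind this change can be sketched as below. The TOY_ constants mirror the usual 14 byte Ethernet header and 2 byte ETHER_ALIGN pad, and the helper names are hypothetical. The sketch assumes the cluster itself starts on a 4 byte boundary, which holds for clusters allocated on 2k boundaries.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ETHER_HDR_LEN	14	/* dst + src + ethertype */
#define TOY_ETHER_ALIGN		2	/* pad placed before the header */
#define TOY_MCL2K		2048	/* rx buffer size the chip wants */

/*
 * Is the IP header 4 byte aligned when the Ethernet header starts
 * "pad" bytes into an aligned cluster?
 */
static int
toy_ip_hdr_aligned(size_t pad)
{
	return ((pad + TOY_ETHER_HDR_LEN) % 4) == 0;
}

/* Cluster size needed to hold the pad plus a full rx buffer. */
static size_t
toy_cluster_needed(size_t rxbuf, size_t pad)
{
	return rxbuf + pad;
}
```

With no pad the IP header lands at offset 14, which is misaligned. With a 2 byte pad it lands at offset 16, but the buffer then needs 2050 bytes, which no longer fits a 2k cluster and is what the 2k + 2 pool provides.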


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interfaces parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interfaces function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff
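
The distinction between "vnetid 0" and "no vnetid" described above can be sketched as follows. This is only an illustration under stated assumptions: `struct ex_tunnel` and the `ex_*_vnetid()` helpers are hypothetical names, not the driver's actual fields or the ioctl handlers.

```c
#include <stdint.h>

/* Hypothetical tunnel state: the id is only meaningful when the flag is set. */
struct ex_tunnel {
	uint32_t	vnetid;		/* valid only when has_vnetid != 0 */
	int		has_vnetid;	/* 0 means "no vnetid configured" */
};

/* Set a vnetid; 0 is a legitimate value here. */
static void
ex_set_vnetid(struct ex_tunnel *t, uint32_t id)
{
	t->vnetid = id;
	t->has_vnetid = 1;
}

/* Clear the vnetid, analogous in spirit to a delete ioctl. */
static void
ex_del_vnetid(struct ex_tunnel *t)
{
	t->vnetid = 0;
	t->has_vnetid = 0;
}

/* Return 1 and store the id if one is set, otherwise return 0. */
static int
ex_get_vnetid(const struct ex_tunnel *t, uint32_t *id)
{
	if (!t->has_vnetid)
		return 0;
	*id = t->vnetid;
	return 1;
}
```

Keeping a separate "is set" flag is one way to let a tool print vnetid 0 when it is configured and print nothing when it is not.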


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@
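
The get/put pairing described above can be sketched in userland C. This is a minimal single-threaded illustration, not the kernel implementation: `struct ex_ifnet`, `ex_iftable`, `ex_if_get()` and `ex_if_put()` are hypothetical names, and a real kernel would need atomic operations on the counter.

```c
#include <stddef.h>

#define EX_NIFS 4

/* Hypothetical stand-in for struct ifnet: only the reference count. */
struct ex_ifnet {
	unsigned int	refcnt;
};

/* Hypothetical index-to-interface table. */
static struct ex_ifnet *ex_iftable[EX_NIFS];

/* Look up an interface by index and take a reference on it. */
static struct ex_ifnet *
ex_if_get(unsigned int idx)
{
	struct ex_ifnet *ifp;

	if (idx >= EX_NIFS)
		return NULL;
	ifp = ex_iftable[idx];
	if (ifp != NULL)
		ifp->refcnt++;
	return ifp;
}

/* Release a reference taken by ex_if_get(); NULL is accepted. */
static void
ex_if_put(struct ex_ifnet *ifp)
{
	if (ifp != NULL)
		ifp->refcnt--;
}
```

Every successful get is matched by exactly one put, ideally in the same function, so that a detaching thread can wait for the count to drop back to its baseline.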


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input(), a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who currently has food poisoning.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With input from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther
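
The alignment trick described above can be sketched as follows. The struct and member names (`ex_sockaddr`, `ex_plainreq`, `ex_aliasreq`, `ex_u`) are hypothetical and the layout is simplified; the point is only that a union with an `int` member raises the alignment of a struct that would otherwise consist solely of bytes.

```c
#include <stddef.h>

/* Byte-only address structure: by itself it needs no particular alignment. */
struct ex_sockaddr {
	unsigned char	sa_len;
	unsigned char	sa_family;
	char		sa_data[14];
};

/* Without the union the whole request is only byte-aligned. */
struct ex_plainreq {
	char			name[16];
	struct ex_sockaddr	addr;
};

/* With the union the compiler must align the request like an int. */
struct ex_aliasreq {
	char	name[16];
	union {
		struct ex_sockaddr	ex_addr;
		int			ex_align;	/* never read, alignment only */
	} ex_u;
};

/* Report the alignment the compiler chose for each variant. */
static size_t
ex_plainreq_align(void)
{
	return _Alignof(struct ex_plainreq);
}

static size_t
ex_aliasreq_align(void)
{
	return _Alignof(struct ex_aliasreq);
}
```

Code that later casts the address member to a wider sockaddr type can then rely on at least `int` alignment wherever the compiler places the request.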


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf
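
A mask of this kind can be sketched as below. The `EX_CAP_*` names and bit values are made up for illustration and are not the real IFCAP_* definitions; the sketch only shows how one mask replaces a hand-written list of flags when a child interface inherits capabilities from its parent.

```c
/* Hypothetical capability bits (values chosen for the example only). */
#define EX_CAP_CSUM_IPV4	0x01u
#define EX_CAP_CSUM_TCPV4	0x02u
#define EX_CAP_CSUM_UDPV4	0x04u
#define EX_CAP_VLAN_MTU		0x10u
#define EX_CAP_CSUM_TCPV6	0x80u
#define EX_CAP_CSUM_UDPV6	0x100u

/* One mask covering every checksum capability, IPv6 included. */
#define EX_CAP_CSUM_MASK	(EX_CAP_CSUM_IPV4 | EX_CAP_CSUM_TCPV4 | \
				 EX_CAP_CSUM_UDPV4 | EX_CAP_CSUM_TCPV6 | \
				 EX_CAP_CSUM_UDPV6)

/* Capabilities a child interface would inherit from its parent. */
static unsigned int
ex_inherit_csum(unsigned int parent_caps)
{
	return parent_caps & EX_CAP_CSUM_MASK;
}
```

Adding a new checksum flag then means updating the mask once instead of every place that enumerates the flags.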


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
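
The changed predicate can be sketched like this. The enum names and their ordering are assumptions made for the example and do not reproduce the real LINK_STATE_* values; only the shape of the test (unknown counts as up) comes from the message above.

```c
/* Hypothetical link states, ordered so that "up" states compare highest. */
enum ex_link_state {
	EX_LINK_UNKNOWN = 0,	/* driver cannot tell */
	EX_LINK_DOWN,
	EX_LINK_UP,
	EX_LINK_HALF_DUPLEX,
	EX_LINK_FULL_DUPLEX
};

/* Treat every state at or above "up", and also "unknown", as usable. */
#define EX_LINK_IS_UP(s) \
	((s) >= EX_LINK_UP || (s) == EX_LINK_UNKNOWN)
```

Callers can then use the one predicate instead of special-casing interfaces whose drivers never report link state.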


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and
overload the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows us to clean up the MPLS input and output paths and fix mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladdr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalive packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows binding interfaces to
an alternate routing table and separating them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx rings size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce an if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows specifying less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@
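
The arithmetic behind this can be sketched as follows. The names are hypothetical, the base value 8 for static routes is an assumption made for the example, and so are the conventions that 0 means "no explicit priority" and that a numerically lower priority is preferred.

```c
/* Assumed base priority for static routes (illustrative value). */
#define EX_RTP_STATIC	8

/*
 * Hypothetical helper: pick the priority for a new route. An explicit
 * priority wins; otherwise the interface priority is added to the base.
 */
static int
ex_route_priority(int explicit_prio, int if_priority)
{
	if (explicit_prio != 0)
		return explicit_prio;
	return EX_RTP_STATIC + if_priority;
}
```

Under these assumptions a route via an interface with priority 0 ends up more preferred than the same route via an interface with priority 4, which is what lets a backup interface take over only when the primary one loses link.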


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbuf clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isn't rapidly consuming rx mbufs, we don't allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interface's start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now it's queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows specifying a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
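
The do ... while idiom referred to above can be shown with a small sketch. `struct ex_pkt`, `struct ex_pktq`, `EX_ENQUEUE()` and `ex_enqueue_if_room()` are hypothetical stand-ins, not the real IF_ENQUEUE() macro or its data structures.

```c
#include <stddef.h>

/* Hypothetical packet and queue types for the example. */
struct ex_pkt {
	struct ex_pkt	*next;
};

struct ex_pktq {
	struct ex_pkt	*head;
	struct ex_pkt	*tail;
	int		 len;
};

/*
 * Multi-statement macro wrapped in do { } while (0) so that it expands
 * to exactly one statement and needs a trailing semicolon like a call.
 */
#define EX_ENQUEUE(q, p) do {				\
	(p)->next = NULL;				\
	if ((q)->tail == NULL)				\
		(q)->head = (p);			\
	else						\
		(q)->tail->next = (p);			\
	(q)->tail = (p);				\
	(q)->len++;					\
} while (0)

/* The macro is safe inside an unbraced if/else thanks to the wrapper. */
static int
ex_enqueue_if_room(struct ex_pktq *q, struct ex_pkt *p, int maxlen)
{
	if (q->len < maxlen)
		EX_ENQUEUE(q, p);
	else
		return 0;
	return 1;
}
```

With a bare `{ ... }` body instead of the wrapper, the semicolon after the macro call would end the `if` statement and leave the `else` dangling, which is a compile error.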


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add an if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow to register callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange get updated on every packet transmission/receipt.
now: if_lastchange get updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we needed two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.
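
The log does not show the macro itself. As a minimal sketch, assuming a
two-branch macro body (the names ifaddr_sketch, ifafree_sketch,
IFAFREE_SKETCH and release_if are hypothetical), this is why the
"do {} while (0)" wrapper matters when a macro is used between a caller's
if and else:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an interface address with a reference count. */
struct ifaddr_sketch {
	int refcnt;
	int freed;
};

static void
ifafree_sketch(struct ifaddr_sketch *ifa)
{
	ifa->freed = 1;
}

/*
 * The do { } while (0) wrapper turns the inner if/else into one statement.
 * Without it, the caller's "else" below would either bind to the macro's
 * inner "if" or fail to compile because of the trailing semicolon.
 */
#define IFAFREE_SKETCH(ifa)			\
do {						\
	if ((ifa)->refcnt <= 0)			\
		ifafree_sketch(ifa);		\
	else					\
		(ifa)->refcnt--;		\
} while (0)

/* Returns 1 if a release was attempted, 0 if it was skipped. */
static int
release_if(struct ifaddr_sketch *ifa, int do_release)
{
	assert(ifa != NULL);
	if (do_release)
		IFAFREE_SKETCH(ifa);
	else
		return 0;
	return 1;
}
```

With the wrapper, "IFAFREE_SKETCH(ifa);" behaves like an ordinary function
call statement wherever it appears.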


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.196 11-Nov-2018 dlg

use the llprio on gre(4) and eoip(4) interfaces for the keepalive tos

llprios are valued 0 to 7, while the ip tos/dscp/tclass is an 8 bit
value. fortunately the high 3 bits map nicely to the llprio values,
so we shift the llprio into place when generating the keepalive
frames. the llprio is defaulted to the value that cisco uses for
their gre keepalives.
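
The shift described above can be sketched as follows. This is an
illustration of the arithmetic only, not the committed code; the names
LLPRIO_MAX_SKETCH and llprio_to_tos are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LLPRIO_MAX_SKETCH	7	/* llprio values run from 0 to 7 */

/* Place the 3-bit llprio in the high 3 bits of the 8-bit tos field. */
static uint8_t
llprio_to_tos(uint8_t llprio)
{
	assert(llprio <= LLPRIO_MAX_SKETCH);
	return (uint8_t)(llprio << 5);
}
```

The low 5 bits of the resulting tos are always zero, so each llprio maps to
exactly one tos value.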


Revision tags: OPENBSD_6_4_BASE
# 1.195 12-Sep-2018 krw

Fix obvious cut&pasto in comment (ifa_msghdr -> if_announcemsghdr).

ok claudio@


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.

ok krw@ tb@


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.
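
The alignment arithmetic behind the 2k + 2 size can be sketched as below,
assuming a 14-byte Ethernet header, a 2-byte ETHER_ALIGN pad, and a cluster
that itself starts on a 4-byte boundary. All names ending in _SKETCH and
the helper functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define ETHER_HDR_LEN_SKETCH	14	/* dst (6) + src (6) + ethertype (2) */
#define ETHER_ALIGN_SKETCH	2	/* pad placed in front of the frame */
#define MCL2K_SKETCH		2048
#define MCL2K2_SKETCH		(MCL2K_SKETCH + ETHER_ALIGN_SKETCH)

/* Offset of the IP header in the cluster, given where the frame starts. */
static size_t
ip_hdr_offset(size_t frame_start)
{
	return frame_start + ETHER_HDR_LEN_SKETCH;
}

/* Is the IP header 4-byte aligned, assuming the cluster itself is? */
static int
ip_hdr_aligned(size_t frame_start)
{
	return (ip_hdr_offset(frame_start) % 4) == 0;
}

/* Bytes left for the chip's rx buffer once the frame start is shifted. */
static size_t
rx_space(size_t cluster_size, size_t frame_start)
{
	assert(frame_start <= cluster_size);
	return cluster_size - frame_start;
}
```

Starting the frame at offset 0 leaves the IP header at offset 14, which is
misaligned. Starting at offset 2 aligns it, but a plain 2k cluster then has
only 2046 bytes left, while a 2k + 2 cluster still offers the full 2048.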


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interfaces parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interfaces function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@
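
A minimal userland sketch of the get/put pairing described above. It is
single-threaded and uses a plain counter; the kernel implementation is not
shown in this log, and every name ending in _sketch is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct ifnet_sketch {
	unsigned int if_index;
	unsigned int refcnt;
};

#define NIFS_SKETCH 4
static struct ifnet_sketch *if_table_sketch[NIFS_SKETCH];

/* Look up an interface by index and take a reference on it. */
static struct ifnet_sketch *
if_get_sketch(unsigned int index)
{
	struct ifnet_sketch *ifp;

	if (index >= NIFS_SKETCH)
		return NULL;
	ifp = if_table_sketch[index];
	if (ifp != NULL)
		ifp->refcnt++;
	return ifp;
}

/* Drop a reference taken by if_get_sketch(); NULL is accepted. */
static void
if_put_sketch(struct ifnet_sketch *ifp)
{
	if (ifp == NULL)
		return;
	assert(ifp->refcnt > 0);
	ifp->refcnt--;
}

/* Get and put in the same function, so the reference is short-lived. */
static int
if_exists_sketch(unsigned int index)
{
	struct ifnet_sketch *ifp = if_get_sketch(index);
	int found = (ifp != NULL);

	if_put_sketch(ifp);
	return found;
}
```

Keeping both calls in one function makes a missing put easy to spot, both
for a reader and for a static analysis tool.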


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input() a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who is currently lebensmittelvergiftet.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@
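
The wraparound-safe comparison can be sketched in portable C as below. This
is an illustration of the idiom, not the kernel code: the subtraction is
done in unsigned arithmetic to avoid signed overflow, and the helper names
tick_diff and ticks_elapsed are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/*
 * Signed distance between two tick counters. Unsigned subtraction wraps
 * in a well-defined way; converting the result back to int yields the
 * expected small signed value on two's complement platforms.
 */
static int
tick_diff(int a, int b)
{
	return (int)((unsigned int)a - (unsigned int)b);
}

/* Have at least `interval` ticks passed since `last`? */
static int
ticks_elapsed(int now, int last, int interval)
{
	assert(interval >= 0);
	return tick_diff(now, last) >= interval;
}
```

Comparing the difference against an interval keeps working when the counter
wraps from INT_MAX to INT_MIN, whereas a direct "now > last" comparison
would report that no time had passed.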


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With inputs from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther
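
A sketch of the union trick, assuming an int member is used as the aligner.
The struct layouts and all names ending in _sketch are hypothetical and only
illustrate how a union raises the alignment of a member made of bytes.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* A sockaddr-like struct made only of bytes, so it may be byte-aligned. */
struct sockaddr_sketch {
	unsigned char	sa_len;
	unsigned char	sa_family;
	char		sa_data[14];
};

/* Without the union, nothing forces more than byte alignment. */
struct aliasreq_plain_sketch {
	char			name[16];
	struct sockaddr_sketch	addr;
};

/* The int member forces the union, and the whole struct, to int alignment. */
struct aliasreq_union_sketch {
	char	name[16];
	union {
		struct sockaddr_sketch	addr;
		int			align;
	} u;
};

static_assert(alignof(struct aliasreq_union_sketch) >= alignof(int),
    "the union must raise the struct's alignment to at least that of int");
```

A union is as strictly aligned as its most strictly aligned member, so the
compiler can no longer place the address on an odd stack offset.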


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf
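
A sketch of the mask idea. The bit values below are illustrative only (the
real definitions live in <net/if.h>), and every name ending in _SKETCH, as
well as inherit_csum_caps, is hypothetical.

```c
#include <assert.h>

/* Illustrative capability bits; not copied from the header. */
#define CAP_CSUM_IPV4_SKETCH	0x001
#define CAP_CSUM_TCPV4_SKETCH	0x002
#define CAP_CSUM_UDPV4_SKETCH	0x004
#define CAP_VLAN_MTU_SKETCH	0x010
#define CAP_CSUM_TCPV6_SKETCH	0x080
#define CAP_CSUM_UDPV6_SKETCH	0x100

/* One mask naming every checksum capability, IPv6 included. */
#define CAP_CSUM_MASK_SKETCH					\
	(CAP_CSUM_IPV4_SKETCH | CAP_CSUM_TCPV4_SKETCH |	\
	 CAP_CSUM_UDPV4_SKETCH | CAP_CSUM_TCPV6_SKETCH |	\
	 CAP_CSUM_UDPV6_SKETCH)

static_assert((CAP_CSUM_MASK_SKETCH & CAP_VLAN_MTU_SKETCH) == 0,
    "the mask must cover checksum bits only");

/* Checksum capabilities a child interface inherits from its parent. */
static unsigned int
inherit_csum_caps(unsigned int parent_caps)
{
	return parent_caps & CAP_CSUM_MASK_SKETCH;
}
```

Using one mask instead of a hand-written list means a newly added checksum
flag only has to be added in one place to be inherited.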


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
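
A sketch of the check described above, assuming link states are ordered so
that every "up" state sorts at or above the plain up value. The enum values
and all names ending in _SKETCH are hypothetical.

```c
#include <assert.h>

/* Reduced, illustrative set of link states. */
enum link_state_sketch {
	LS_UNKNOWN_SKETCH = 0,
	LS_DOWN_SKETCH = 2,
	LS_UP_SKETCH = 4,
	LS_FULL_DUPLEX_SKETCH = 6
};

static_assert(LS_UP_SKETCH > LS_DOWN_SKETCH,
    "up states must sort above down states");

/* Unknown counts as up; otherwise anything at or above UP is up. */
#define LINK_STATE_IS_UP_SKETCH(s)	\
	((s) >= LS_UP_SKETCH || (s) == LS_UNKNOWN_SKETCH)
```

Treating unknown as up lets callers use one test for drivers that never
report a link state at all.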


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows us to clean up the MPLS input and output paths and fix mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalives packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows binding interfaces to an
alternate routing table and separating them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx rings size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce an if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows specifying less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interfaces start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now its queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows to specify a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
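
A small sketch of why the do ... while form matters. The queue types and the
names Q_ENQUEUE and enqueue_if_room() are hypothetical, not the kernel's
IF_ENQUEUE() or struct ifqueue. Wrapped this way, the macro followed by a
semicolon is a single statement, so it can sit in an unbraced if/else.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet and queue types for illustration only. */
struct pkt {
	struct pkt	*next;
};

struct queue {
	struct pkt	*head;
	struct pkt	*tail;
	int		 len;
};

/* Multi-statement macro wrapped so that "Q_ENQUEUE(q, p);" is one statement. */
#define Q_ENQUEUE(q, p) do {					\
	(p)->next = NULL;					\
	if ((q)->tail == NULL)					\
		(q)->head = (p);				\
	else							\
		(q)->tail->next = (p);				\
	(q)->tail = (p);					\
	(q)->len++;						\
} while (/* CONSTCOND */ 0)

/* Returns 1 if the packet was enqueued, 0 if the queue was full. */
int
enqueue_if_room(struct queue *q, struct pkt *p, int maxlen)
{
	assert(q != NULL && p != NULL);

	/* With a bare { ... } macro body, the ';' before 'else' would not parse. */
	if (q->len < maxlen)
		Q_ENQUEUE(q, p);
	else
		return 0;
	return 1;
}
```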


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add an if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow to register callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange get updated on every packet transmission/receipt.
now: if_lastchange get updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we need two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.195 12-Sep-2018 krw

Fix obvious cut&pasto in comment (ifa_msghdr -> if_announcemsghdr).

ok claudio@


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.

ok krw@ tb@
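
A sketch of the tightened range check described above. The constant names
LLPRIO_MIN and LLPRIO_MAX and the function set_llprio() are hypothetical;
only the 0 to 7 inclusive range comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define LLPRIO_MIN	0	/* hypothetical names; the range is 0..7 */
#define LLPRIO_MAX	7

/*
 * Store the requested priority only if the rest of the stack can
 * handle it. Returns 0 on success, EINVAL otherwise.
 */
int
set_llprio(int *llprio, int requested)
{
	assert(llprio != NULL);

	if (requested < LLPRIO_MIN || requested > LLPRIO_MAX)
		return EINVAL;
	*llprio = requested;
	return 0;
}
```

A check against UCHAR_MAX alone would have accepted values such as 8 or 200,
which fit in the field but are outside what the queueing code expects.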


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.
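
The arithmetic behind this can be sketched as follows. ETHER_HDR_LEN (14) and
ETHER_ALIGN (2) are the usual Ethernet values; RXBUF_SIZE, ip_hdr_aligned()
and pick_cluster() are hypothetical helpers for illustration and the cluster
size lists are assumptions, not the kernel's pool table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETHER_HDR_LEN	14
#define ETHER_ALIGN	2
#define RXBUF_SIZE	2048	/* chip wants rx buffers in kilobyte units */

/* Is the IP header 4-byte aligned when the frame starts at buf + pad? */
int
ip_hdr_aligned(uintptr_t buf, size_t pad)
{
	return ((buf + pad + ETHER_HDR_LEN) % 4) == 0;
}

/*
 * Smallest cluster from sizes[] (sorted ascending) that holds the rx
 * buffer plus the alignment pad. Returns 0 if none is large enough.
 */
size_t
pick_cluster(const size_t *sizes, size_t n, size_t pad)
{
	size_t i;

	assert(sizes != NULL);
	for (i = 0; i < n; i++)
		if (sizes[i] >= RXBUF_SIZE + pad)
			return sizes[i];
	return 0;
}
```

With a 2k-aligned buffer and no pad the IP header lands at offset 14, which
is not a multiple of 4; with a 2 byte pad it lands at 16, but the buffer then
needs 2050 bytes and no longer fits a plain 2k cluster.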


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interfaces parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interfaces function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@
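
The get/put pairing described above can be sketched in userland C. This is only an illustration of the general refcount pattern, not the kernel code: struct fake_ifnet, fake_if_get() and fake_if_put() are hypothetical names, and the use of C11 atomics is an assumption made for the sketch.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ifnet; only the refcount matters here. */
struct fake_ifnet {
	atomic_uint	refcnt;
	unsigned int	index;
};

/* Take a reference; the caller must balance it with fake_if_put(). */
static struct fake_ifnet *
fake_if_get(struct fake_ifnet *ifp)
{
	if (ifp != NULL)
		atomic_fetch_add(&ifp->refcnt, 1);
	return (ifp);
}

/* Drop a reference; returns 1 when the last reference went away. */
static int
fake_if_put(struct fake_ifnet *ifp)
{
	if (ifp == NULL)
		return (0);
	/* atomic_fetch_sub() returns the value before the decrement */
	return (atomic_fetch_sub(&ifp->refcnt, 1) == 1);
}
```

A detach path could wait until the count drops to the owner's single reference before freeing the interface, which is roughly the check the commit says will be backfilled later.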


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input() a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who is currently lebensmittelvergiftet.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@
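
The wrap-safe comparison mentioned above can be sketched as follows. This is a userland illustration under stated assumptions, not the code from kern_timeout.c: tick_deadline() and tick_expired() are hypothetical helpers, and the subtraction is done in unsigned arithmetic so the sketch avoids signed overflow.

```c
#include <limits.h>

/*
 * Hypothetical helper: compute a deadline `interval` ticks after `now`.
 * The addition is done unsigned so it wraps past INT_MAX with defined
 * behaviour; converting back to int is implementation-defined, but it
 * wraps on the usual two's complement platforms.
 */
static int
tick_deadline(int now, int interval)
{
	return ((int)((unsigned int)now + (unsigned int)interval));
}

/*
 * Hypothetical helper: nonzero once `now` has reached or passed
 * `deadline`. Comparing the difference against zero, instead of
 * comparing the two values directly, keeps working across a wrap.
 */
static int
tick_expired(int deadline, int now)
{
	return ((int)((unsigned int)deadline - (unsigned int)now) <= 0);
}
```

A direct `deadline <= now` test would misfire when the counter is near INT_MAX and the deadline has already wrapped to a negative value; the difference form does not.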


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With inputs from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther
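
The alignment trick described above can be sketched like this. The types are hypothetical stand-ins (struct ex_addr for a sockaddr-like structure made only of bytes, struct ex_req for the request structure); the point is only that an int member in the union raises the alignment of the enclosing structure.

```c
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical sockaddr-like structure: all members are bytes. */
struct ex_addr {
	unsigned char	len;
	unsigned char	family;
	char		data[14];
};

/* Hypothetical request structure using the union trick. */
struct ex_req {
	char	name[16];
	union {
		struct ex_addr	addr;	/* the member callers actually use */
		int		align;	/* never read; forces int alignment */
	} u;
};
```

Without the int member every field has byte alignment, so the compiler would be free to place the whole structure at any address on the stack.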


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf
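
A mask of this kind can be sketched as below. The EX_ names and bit values are hypothetical and chosen only for illustration; they are not the values from if.h.

```c
/* Hypothetical capability bits; the real values in if.h may differ. */
#define EX_IFCAP_CSUM_IPv4	0x01U
#define EX_IFCAP_CSUM_TCPv4	0x02U
#define EX_IFCAP_CSUM_UDPv4	0x04U
#define EX_IFCAP_CSUM_TCPv6	0x08U
#define EX_IFCAP_CSUM_UDPv6	0x10U
#define EX_IFCAP_OTHER		0x20U	/* some non-checksum capability */

/* One mask instead of repeating the list at every use. */
#define EX_IFCAP_CSUM_MASK	(EX_IFCAP_CSUM_IPv4 | EX_IFCAP_CSUM_TCPv4 | \
    EX_IFCAP_CSUM_UDPv4 | EX_IFCAP_CSUM_TCPv6 | EX_IFCAP_CSUM_UDPv6)

/* A child interface inherits only the checksum bits of its parent. */
static unsigned int
ex_inherit_csum(unsigned int parent_caps)
{
	return (parent_caps & EX_IFCAP_CSUM_MASK);
}
```

Because the mask is defined once, adding a new checksum capability to it updates every user at the same time, which is how the IPv6 flags end up inherited.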


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
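
The changed predicate can be sketched as follows. The EX_ state names and their numeric order are assumptions made for this example, not the definitions from if.h.

```c
/* Hypothetical link states, ordered so that "up" states compare highest. */
enum ex_link_state {
	EX_LINK_STATE_UNKNOWN = 0,	/* driver cannot tell */
	EX_LINK_STATE_DOWN,
	EX_LINK_STATE_UP
};

/* Unknown is treated as up, so only a known-down link counts as down. */
#define EX_LINK_STATE_IS_UP(s) \
	((s) >= EX_LINK_STATE_UP || (s) == EX_LINK_STATE_UNKNOWN)
```

With this shape, callers no longer need a separate check for interfaces whose drivers never report link state.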


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows to cleanup the MPLS input and output paths and fix mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).
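
The pointer switch described above can be sketched in a few lines. Everything here is a hypothetical stand-in: struct ex_ifnet, the if_ll_output member name, the flag value and the integer "packet" exist only to show how the output hook is swapped and chained.

```c
#include <stddef.h>

#define EX_IFXF_MPLS	0x1U	/* hypothetical flag value */

/* Hypothetical interface with an output hook and a saved L2 hook. */
struct ex_ifnet {
	unsigned int	 xflags;
	int		(*if_output)(struct ex_ifnet *, int);
	int		(*if_ll_output)(struct ex_ifnet *, int);
};

/* Stand-in for the link-level output routine. */
static int
ex_ether_output(struct ex_ifnet *ifp, int pkt)
{
	(void)ifp;
	return (pkt);
}

/* Stand-in for MPLS handling, which then calls the link-level routine. */
static int
ex_mpls_output(struct ex_ifnet *ifp, int pkt)
{
	return (ifp->if_ll_output(ifp, pkt + 1000));
}

/* Setting or clearing the flag switches the output pointers. */
static void
ex_set_mpls(struct ex_ifnet *ifp, int enable)
{
	if (enable && !(ifp->xflags & EX_IFXF_MPLS)) {
		ifp->xflags |= EX_IFXF_MPLS;
		ifp->if_ll_output = ifp->if_output;
		ifp->if_output = ex_mpls_output;
	} else if (!enable && (ifp->xflags & EX_IFXF_MPLS)) {
		ifp->xflags &= ~EX_IFXF_MPLS;
		ifp->if_output = ifp->if_ll_output;
		ifp->if_ll_output = NULL;
	}
}
```

Callers keep invoking if_output as before; whether the packet passes through the MPLS step depends only on which function the pointer currently holds.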


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladdr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalives packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx rings size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce a if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows to specify less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@
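
The priority calculation can be sketched as below. EX_RTP_STATIC and its value, the function name, and the convention that 0 means "no explicit priority" are assumptions for illustration only.

```c
#define EX_RTP_STATIC	8	/* hypothetical base priority for static routes */

/*
 * Hypothetical helper: pick the priority for a new route. An explicit
 * priority wins; otherwise the interface priority is added to the
 * static base, so routes on a higher if_priority interface rank later.
 */
static int
ex_route_priority(int explicit_prio, int if_priority)
{
	if (explicit_prio != 0)
		return (explicit_prio);
	return (EX_RTP_STATIC + if_priority);
}
```

Two default routes added without a priority then sort by the priority of their interfaces, and the less preferred one is only used once the primary loses link.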


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interface's start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now its queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows to specify a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
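
The do ... while idiom can be sketched as follows. struct ex_queue, EX_ENQUEUE() and ex_enqueue_if() are hypothetical and much simpler than the real queue macros; they only show why the wrapper matters.

```c
/* Hypothetical queue that only tracks counters. */
struct ex_queue {
	int	len;
	int	maxlen;
	int	drops;
	int	skipped;
};

/*
 * Wrapping the body in do { ... } while (0) makes the macro expand to
 * exactly one statement, so it can be followed by a semicolon and used
 * as the body of an if that has an else branch.
 */
#define EX_ENQUEUE(q) do {						\
	if ((q)->len >= (q)->maxlen)					\
		(q)->drops++;						\
	else								\
		(q)->len++;						\
} while (0)

static void
ex_enqueue_if(struct ex_queue *q, int cond)
{
	if (cond)
		EX_ENQUEUE(q);	/* a bare { ... } macro would break here */
	else
		q->skipped++;
}
```

With a plain brace block instead, the semicolon after the macro call would end the if statement and the else would fail to compile.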


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add a if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow to register callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange gets updated on every packet transmission/receipt.
now: if_lastchange gets updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyway).

benefit: the following command now works. previously we needed two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding the 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.
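
A minimal sketch of why the do { } while (0) wrapper matters for a
multi-statement macro. This is not the historical IFAFREE() definition;
demo_ifaddr, demo_ifafree(), DEMO_IFAFREE() and demo_release() are
hypothetical names used only for illustration.

```c
/* Hypothetical stand-in for an interface address with a reference count. */
struct demo_ifaddr {
	int refcnt;
	int freed;	/* set instead of calling free(), for illustration */
};

static void
demo_ifafree(struct demo_ifaddr *ifa)
{
	ifa->freed = 1;
}

/*
 * Wrapped in do { } while (0), the macro expands to exactly one
 * statement, so "if (x) DEMO_IFAFREE(p); else ..." parses as intended.
 * A bare if/else or a plain { } block would break on the trailing ';'.
 */
#define DEMO_IFAFREE(ifa) do {				\
	if ((ifa)->refcnt <= 0)				\
		demo_ifafree(ifa);			\
	else						\
		(ifa)->refcnt--;			\
} while (0)

/* Drop a reference only when "release" is non-zero; returns 1 if skipped. */
static int
demo_release(struct demo_ifaddr *ifa, int release)
{
	int skipped = 0;

	if (release)
		DEMO_IFAFREE(ifa);
	else
		skipped = 1;
	return skipped;
}
```

The else branch in demo_release() binds to the outer if only because the
macro is a single statement.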


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.194 30-May-2018 dlg

restrict the prio values from SIOCSIFLLPRIO to what the kernel handles

previously the ioctl code checked that prio was an int less than
UCHAR_MAX, but the rest of the kernel (and priq code in particular)
expects it to be between 0 and 7 inclusive.

ok krw@ tb@
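
The range check described above could look roughly like this. It is a
sketch under the assumption that valid priorities are 0 through 7
inclusive, as the message states; demo_check_llprio() and DEMO_PRIO_MAX
are hypothetical names, not the kernel's.

```c
#include <errno.h>

#define DEMO_PRIO_MAX	7	/* assumed upper bound, per the message above */

/* Return 0 if prio is within 0..DEMO_PRIO_MAX, EINVAL otherwise. */
static int
demo_check_llprio(int prio)
{
	if (prio < 0 || prio > DEMO_PRIO_MAX)
		return EINVAL;
	return 0;
}
```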


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isn't set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header won't be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interface's parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interface's function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@
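
A userland sketch of the get/put pairing described above. It assumes a
simple index-to-interface table and a plain integer counter; the real
kernel code needs atomic operations and sleeping on detach, which are
omitted here. All demo_* names are hypothetical.

```c
#include <stddef.h>

#define DEMO_NIFS	4

struct demo_ifnet {
	int	present;	/* slot holds an attached interface */
	int	refcnt;		/* outstanding references */
};

static struct demo_ifnet demo_iftable[DEMO_NIFS];

/* Look up an interface by index and take a reference, or return NULL. */
static struct demo_ifnet *
demo_if_get(unsigned int idx)
{
	struct demo_ifnet *ifp;

	if (idx >= DEMO_NIFS)
		return NULL;
	ifp = &demo_iftable[idx];
	if (!ifp->present)
		return NULL;
	ifp->refcnt++;
	return ifp;
}

/* Release a reference taken by demo_if_get(); NULL is accepted. */
static void
demo_if_put(struct demo_ifnet *ifp)
{
	if (ifp == NULL)
		return;
	ifp->refcnt--;
}

/* A detach may only proceed once every reference has been released. */
static int
demo_if_can_detach(const struct demo_ifnet *ifp)
{
	return ifp->refcnt == 0;
}
```

Every successful demo_if_get() is meant to be matched by exactly one
demo_if_put(), ideally within the same function.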


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input(), a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who is currently lebensmittelvergiftet.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With inputs from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks for the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther
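
One way to force alignment with a union, sketched under assumptions:
the struct and member names below are hypothetical and the address
layout is simplified. The extra union member is never stored; it exists
only so the compiler gives the address field at least that member's
alignment.

```c
#include <stddef.h>

/* Simplified stand-in for a socket address; byte-aligned on its own. */
struct demo_sockaddr {
	unsigned char	sa_len;
	unsigned char	sa_family;
	char		sa_data[14];
};

struct demo_aliasreq {
	char	name[16];
	union {
		struct demo_sockaddr	addr;
		long			align;	/* alignment only, never used */
	} un;
};

/* Convenience accessor so callers do not spell out the union. */
#define demo_addr	un.addr
```

Because the union contains a long, both the union and the enclosing
struct are aligned at least as strictly as a long, wherever the
compiler places them.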


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf
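
A sketch of the mask idea: OR the individual checksum capability bits
into one constant and use it to copy just those bits from a parent
interface. The DEMO_* names and bit values are illustrative
assumptions, not the real IFCAP_* definitions.

```c
/* Illustrative capability bits; the real values may differ. */
#define DEMO_CSUM_IPv4		0x01
#define DEMO_CSUM_TCPv4		0x02
#define DEMO_CSUM_TCPv6		0x04
#define DEMO_CSUM_UDPv6		0x08
#define DEMO_VLAN_MTU		0x10	/* a non-checksum capability */

/* One mask instead of repeating the list at every use. */
#define DEMO_CSUM_MASK \
	(DEMO_CSUM_IPv4 | DEMO_CSUM_TCPv4 | DEMO_CSUM_TCPv6 | DEMO_CSUM_UDPv6)

/* Return only the checksum capabilities a child would inherit. */
static unsigned int
demo_inherit_csum(unsigned int parent_caps)
{
	return parent_caps & DEMO_CSUM_MASK;
}
```

Adding a new checksum bit to the mask then updates every user of the
mask at once, which is how the IPv6 bits end up inherited.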


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
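
A sketch of a link-state predicate that treats "unknown" as up, as the
message above describes. The DEMO_* constants and their numeric values
are assumptions made for illustration and are not copied from if.h.

```c
/* Illustrative link states, ordered so that "up" states compare high. */
#define DEMO_LINK_STATE_UNKNOWN		0
#define DEMO_LINK_STATE_DOWN		2
#define DEMO_LINK_STATE_UP		4
#define DEMO_LINK_STATE_FULL_DUPLEX	6

/* Up means any state at or above UP, or a state we know nothing about. */
#define DEMO_LINK_STATE_IS_UP(s) \
	((s) >= DEMO_LINK_STATE_UP || (s) == DEMO_LINK_STATE_UNKNOWN)
```

With this definition, callers no longer need a separate special case
for drivers that never report link state.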


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows to cleanup the MPLS input and output paths and fix mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladdr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalive packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows binding interfaces to
alternate routing tables and separating them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx rings size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce an if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows specifying less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@
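The priority arithmetic described above can be sketched as follows. This is a minimal illustration, not the kernel code: the function name route_priority() is hypothetical, and the numeric values of RTP_NONE and RTP_STATIC are assumptions that should be checked against net/route.h. It also assumes that a numerically lower priority is preferred.

```c
/* Assumed illustrative values; the real ones live in net/route.h. */
#define RTP_NONE	0	/* caller gave no explicit priority */
#define RTP_STATIC	8	/* base priority of a static route */

/*
 * Hypothetical helper: pick the priority of a new static route.
 * An explicit priority is used as-is; otherwise the interface's
 * if_priority is added to RTP_STATIC.
 */
unsigned int
route_priority(unsigned int requested, unsigned int if_priority)
{
	if (requested != RTP_NONE)
		return requested;
	return RTP_STATIC + if_priority;
}
```

With these assumed values, a route over an interface with if_priority 0 gets priority 8 and one over an interface with if_priority 4 gets 12, so the second only takes over when the first goes away.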


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interfaces start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now its queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.
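The "queue, queue, queue, send" behaviour can be sketched with a pending flag that is flushed once per softnet run. All names here (struct txif, tx_enqueue, softnet_end) are hypothetical and only illustrate the idea; the actual mitigation code is not shown in this log.

```c
/* Hypothetical per-interface transmit state. */
struct txif {
	int qlen;		/* packets waiting on the send queue */
	int start_pending;	/* start routine should run at softnet end */
	int start_calls;	/* how many times the start routine ran */
};

/* Queue one packet and only note that the start routine is needed. */
void
tx_enqueue(struct txif *ifp)
{
	ifp->qlen++;
	ifp->start_pending = 1;
}

/* Stand-in for a driver start routine: posts everything queued so far. */
void
tx_start(struct txif *ifp)
{
	ifp->start_calls++;
	ifp->qlen = 0;
}

/* Called once at the end of a softnet run: at most one start call. */
void
softnet_end(struct txif *ifp)
{
	if (ifp->start_pending) {
		ifp->start_pending = 0;
		tx_start(ifp);
	}
}
```

Three calls to tx_enqueue() followed by one softnet_end() result in a single tx_start() call that sees all three packets.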


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows specifying a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
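Why a multi-statement macro wants the do ... while wrapper can be seen in a small sketch. The queue type and the Q_ENQUEUE macro below are hypothetical stand-ins, not the real IF_ENQUEUE() definition from if.h.

```c
#include <stddef.h>

/* Hypothetical miniature packet queue, not the real struct ifqueue. */
struct pkt {
	struct pkt *next;
};

struct queue {
	struct pkt *head;
	struct pkt *tail;
	int len;
};

/*
 * Wrapped in do { } while (0) so the macro expands to exactly one
 * statement and can be followed by a semicolon like a function call.
 */
#define Q_ENQUEUE(q, p) do {				\
	(p)->next = NULL;				\
	if ((q)->tail == NULL)				\
		(q)->head = (p);			\
	else						\
		(q)->tail->next = (p);			\
	(q)->tail = (p);				\
	(q)->len++;					\
} while (/* CONSTCOND */ 0)

/* Enqueue only when there is room; returns 1 on success, 0 on drop. */
int
enqueue_if_room(struct queue *q, struct pkt *p, int maxlen)
{
	if (q->len < maxlen)
		Q_ENQUEUE(q, p);	/* one statement, so the else still binds */
	else
		return 0;
	return 1;
}
```

If the macro body were a bare { ... } block, the semicolon after Q_ENQUEUE(q, p) would end the if statement and the following else would fail to compile.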


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on a similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add an if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed by deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok
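The existence check described above amounts to a bounds test against the table size plus a NULL test on the slot. A minimal sketch, where the function name if_index_exists() and the stripped-down struct ifnet are assumptions made for illustration:

```c
#include <stddef.h>

/* Minimal stand-in; the real struct ifnet has many more members. */
struct ifnet {
	unsigned int if_index;
};

/*
 * Hypothetical helper: an interface exists only if the index is within
 * the table limit and the slot has not been cleared by a destroy.
 */
int
if_index_exists(struct ifnet **ifindex2ifnet, unsigned int if_indexlim,
    unsigned int idx)
{
	return idx < if_indexlim && ifindex2ifnet[idx] != NULL;
}
```

A slot that was valid once can later read as NULL when a dynamically created interface is destroyed, which is why comparing against the highest index handed out so far is not enough.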


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow registering callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange get updated on every packet transmission/receipt.
now: if_lastchange get updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we need two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.193 25-Apr-2018 jca

Make this header standalone #if __BSD_VISIBLE, by including needed headers

Puts us in line with Free/NetBSD and Linux and will get us rid of
pointless patches in the ports tree. ok guenther@ deraadt@


Revision tags: OPENBSD_6_3_BASE
# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.
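The alignment arithmetic behind the 2k + 2 cluster can be checked with a short sketch. ETHER_HDR_LEN (14) and ETHER_ALIGN (2) are the usual Ethernet constants; the helper names and the MCL2K/MCL2K2 macros are hypothetical and only mirror the sizes named in the message.

```c
#include <stdint.h>

#define ETHER_HDR_LEN	14		/* Ethernet header length */
#define ETHER_ALIGN	2		/* pad that 4-byte aligns the IP header */
#define MCL2K		2048		/* plain 2k cluster */
#define MCL2K2		(MCL2K + ETHER_ALIGN)	/* the 2k + 2 cluster */

/* Is the IP header 4-byte aligned if the frame starts pad bytes in? */
int
ip_hdr_is_aligned(uintptr_t cluster, unsigned int pad)
{
	return ((cluster + pad + ETHER_HDR_LEN) % 4) == 0;
}

/* Does a pad plus an rx buffer of rx_buf_size fit in the cluster? */
int
rx_buf_fits(unsigned int cluster_size, unsigned int pad,
    unsigned int rx_buf_size)
{
	return pad + rx_buf_size <= cluster_size;
}
```

For a cluster on a 2k boundary, the IP header lands at offset 14 without the pad and at offset 16 with it. A chip that insists on a full 2048-byte rx buffer then needs 2050 bytes, which a plain 2k cluster cannot provide but the 2k + 2 cluster can.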


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interfaces parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interfaces function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@
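The get/put pairing described above can be sketched with a plain counter. This is a simplified, single-threaded illustration: struct iface and the iface_* functions are hypothetical, and a real kernel implementation would need atomic operations and a way for the detaching thread to sleep until the count drains.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct ifnet with only a reference count. */
struct iface {
	unsigned int refcnt;
};

/* Take a reference; the caller must balance it with iface_put(). */
struct iface *
iface_get(struct iface *ifp)
{
	if (ifp != NULL)
		ifp->refcnt++;
	return ifp;
}

/* Release a reference taken by iface_get(); NULL is tolerated. */
void
iface_put(struct iface *ifp)
{
	if (ifp != NULL)
		ifp->refcnt--;
}

/* Detach may proceed only once nobody holds a reference. */
int
iface_can_detach(const struct iface *ifp)
{
	return ifp->refcnt == 0;
}
```

Keeping the get and the put in the same function makes the pairing easy to audit, and the detach path only has to wait for the counter to reach zero.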


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@
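
Illustrative sketch (not part of the commit): one way a tick-stamped global congestion marker could behave. The names toy_ticks, toy_congestion_mark and TOY_CONGESTION_WINDOW, the window length, and the initial marker value are all assumptions made for the example.

```c
/* How many ticks after a mark the system still counts as congested. */
#define TOY_CONGESTION_WINDOW	10

int toy_ticks;	/* hypothetical clock, advanced once per tick */

/* Start outside the window so an unmarked system is not congested. */
int toy_congestion_mark = -(TOY_CONGESTION_WINDOW + 1);

/* Mark the whole system as congested "now". */
void
toy_mark_congested(void)
{
	toy_congestion_mark = toy_ticks;
}

/*
 * Congested while the clock is still close to the marker. The
 * subtraction is done on unsigned values so that a wrapping tick
 * counter does not trigger signed overflow.
 */
int
toy_is_congested(void)
{
	int diff;

	diff = (int)((unsigned int)toy_ticks -
	    (unsigned int)toy_congestion_mark);
	return diff >= 0 && diff <= TOY_CONGESTION_WINDOW;
}
```

No timer is needed to clear the state: as the clock advances, the comparison simply stops holding.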


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input(), a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@, who currently has food poisoning.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even though the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@
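
Illustrative sketch (not part of the commit): why interval checks on a wrapping tick counter are written as a difference rather than as a direct comparison. The helper names are hypothetical, and the subtraction is done on unsigned values here (a choice made for this sketch) so that the wrap is well defined in portable C.

```c
#include <limits.h>	/* INT_MAX / INT_MIN, useful when exercising the wrap */

/* Naive check: breaks as soon as the counter wraps past INT_MAX. */
int
toy_naive_later(int now, int then)
{
	return now > then;
}

/* Ticks elapsed between two samples of a wrapping counter. */
int
toy_ticks_since(int now, int then)
{
	return (int)((unsigned int)now - (unsigned int)then);
}

/* Nonzero once at least `interval` ticks have passed since `then`. */
int
toy_interval_passed(int now, int then, int interval)
{
	return toy_ticks_since(now, then) >= interval;
}
```

With then = INT_MAX - 5 and now = INT_MIN + 5 (the counter has wrapped), the naive comparison claims no time has passed, while the difference form reports the ticks that actually elapsed.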


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With input from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf
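
Illustrative sketch (not part of the commit): how a single mask lets a child interface inherit every checksum capability of its parent without listing the flags one by one. The TOY_ flag names and bit values are hypothetical and are not the real IFCAP_* assignments.

```c
#include <stdint.h>

/* Hypothetical capability bits, chosen only for this sketch. */
#define TOY_IFCAP_CSUM_IPv4	0x00000001
#define TOY_IFCAP_CSUM_TCPv4	0x00000002
#define TOY_IFCAP_CSUM_UDPv4	0x00000004
#define TOY_IFCAP_VLAN_MTU	0x00000010	/* not a checksum flag */
#define TOY_IFCAP_CSUM_TCPv6	0x00000080
#define TOY_IFCAP_CSUM_UDPv6	0x00000100

/* One mask covering every checksum capability, IPv6 included. */
#define TOY_IFCAP_CSUM_MASK	(TOY_IFCAP_CSUM_IPv4 | \
    TOY_IFCAP_CSUM_TCPv4 | TOY_IFCAP_CSUM_UDPv4 | \
    TOY_IFCAP_CSUM_TCPv6 | TOY_IFCAP_CSUM_UDPv6)

/* Capabilities a vlan-like child takes over from its parent. */
uint32_t
toy_vlan_inherit_caps(uint32_t parent_caps)
{
	return parent_caps & TOY_IFCAP_CSUM_MASK;
}
```

Because the mask is defined next to the flags, a checksum flag added later is picked up by every user of the mask as soon as the mask is extended.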


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
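
Illustrative sketch (not part of the commit): a link-state test that treats "unknown" as up. The enum values and the TOY_ names are hypothetical and only need to be ordered so that every up state sorts at or above TOY_LINK_STATE_UP.

```c
/* Hypothetical values, chosen only for this sketch. */
enum toy_link_state {
	TOY_LINK_STATE_UNKNOWN = 0,	/* driver cannot tell */
	TOY_LINK_STATE_DOWN = 1,
	TOY_LINK_STATE_UP = 2,
	TOY_LINK_STATE_HALF_DUPLEX = 3,
	TOY_LINK_STATE_FULL_DUPLEX = 4
};

/* Up means "UP or better", or a driver that reports nothing at all. */
#define TOY_LINK_STATE_IS_UP(s) \
	((s) >= TOY_LINK_STATE_UP || (s) == TOY_LINK_STATE_UNKNOWN)
```

Callers can then use the one macro instead of testing for the unknown state separately.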


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.
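
Illustrative sketch (not part of the commit): the two rules named above, growth limited to once per tick and punishment by an eighth, applied to a toy ring. The struct layout, the field names and the growth step of one are assumptions made for the example, not the mclgeti implementation.

```c
/* Hypothetical per-ring accounting. */
struct toy_rxring {
	int	lwm;	/* low watermark, never shrink below this */
	int	cwm;	/* current watermark */
	int	hwm;	/* high watermark, never grow above this */
	int	grown;	/* tick at which the ring last grew */
};

/* Grow by one slot, but at most once per tick. */
void
toy_ring_grow(struct toy_rxring *r, int now)
{
	if (r->grown == now || r->cwm >= r->hwm)
		return;
	r->cwm++;
	r->grown = now;
}

/* Livelock detected: shrink by an eighth, bounded by the low watermark. */
void
toy_ring_punish(struct toy_rxring *r)
{
	r->cwm -= r->cwm / 8;
	if (r->cwm < r->lwm)
		r->cwm = r->lwm;
}
```

Shrinking by an eighth instead of a half keeps the ring from collapsing after a single bad tick, which matches the throughput concern described in the message.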


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows cleaning up the MPLS input and output paths and fixing mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalive packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows binding interfaces to
an alternate routing table and separating them from other interfaces in
distinct routing tables. The same network can now be used in any domain at
the same time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx ring's size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce an if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows specifying less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@
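
Illustrative sketch (not part of the commit): how a per-interface priority could feed into a route's default priority. TOY_RTP_NONE, TOY_RTP_STATIC, their values and struct toy_prio_if are hypothetical, and the sketch assumes that a numerically lower priority is the preferred one.

```c
#define TOY_RTP_NONE	0	/* caller gave no explicit priority */
#define TOY_RTP_STATIC	8	/* hypothetical base for static routes */

/* Hypothetical stand-in for the interface descriptor. */
struct toy_prio_if {
	int	if_priority;	/* 0 for preferred links, higher for backups */
};

/* Priority a new route gets on this interface. */
int
toy_route_priority(const struct toy_prio_if *ifp, int explicit_prio)
{
	if (explicit_prio != TOY_RTP_NONE)
		return explicit_prio;	/* the caller's choice wins */
	return TOY_RTP_STATIC + ifp->if_priority;
}
```

Two default routes added without an explicit priority then sort by interface: the one on the interface with the larger if_priority only takes over when the other disappears.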


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isn't rapidly consuming rx mbufs, we don't allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interface's start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now it's queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.
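
Illustrative sketch (not part of the commit): the "queue, queue, queue, send" pattern, with the start routine deferred to a once-per-pass handler. All names (toy_txif, toy_if_enqueue, toy_softnet, toy_if_start) and the counters are hypothetical and exist only to make the batching visible.

```c
/* Hypothetical interface state for the sketch. */
struct toy_txif {
	int	sndq_len;	/* packets waiting on the send queue */
	int	start_pending;	/* a start is wanted in this pass */
	int	start_calls;	/* how often the start routine ran */
	int	sent;		/* packets handed to the tx ring */
};

/* Driver start routine: post the whole send queue in one chunk. */
void
toy_if_start(struct toy_txif *ifp)
{
	ifp->start_calls++;
	ifp->sent += ifp->sndq_len;
	ifp->sndq_len = 0;
}

/* Stack side: queue a packet and only note that a start is wanted. */
void
toy_if_enqueue(struct toy_txif *ifp)
{
	ifp->sndq_len++;
	ifp->start_pending = 1;
}

/* Runs once per softnet pass and calls start at most once. */
void
toy_softnet(struct toy_txif *ifp)
{
	if (!ifp->start_pending)
		return;
	ifp->start_pending = 0;
	toy_if_start(ifp);
}
```

Three calls to toy_if_enqueue() followed by one toy_softnet() result in a single start call that sees all three packets.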


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces'
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows specifying a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
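
Illustrative sketch (not part of the commit): why multi-statement macros are wrapped in do ... while (0). The queue struct and the TOY_ macros are hypothetical and are not the IF_ENQUEUE() definitions.

```c
/* Hypothetical queue counters for the sketch. */
struct toy_queue {
	int	len;	/* packets currently queued */
	int	total;	/* packets ever queued */
	int	maxlen;	/* queue limit */
	int	drops;	/* packets rejected because the queue was full */
};

/* Two statements that must behave as one C statement. */
#define TOY_ENQUEUE(q) do {		\
	(q)->len++;			\
	(q)->total++;			\
} while (0)

#define TOY_DROP(q) do {		\
	(q)->drops++;			\
} while (0)

/*
 * With bare braces instead of do ... while (0), the semicolon after
 * TOY_DROP(q) would end the if statement and the else would not compile.
 */
void
toy_input(struct toy_queue *q)
{
	if (q->len >= q->maxlen)
		TOY_DROP(q);
	else
		TOY_ENQUEUE(q);
}
```

The wrapper costs nothing at run time; the compiler removes the loop.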


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on a similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add an if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow registering callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange gets updated on every packet transmission/receipt.
now: if_lastchange gets updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we needed two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.
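
To illustrate why the wrapper matters, here is a minimal sketch. The struct, the helper and the macro body are hypothetical stand-ins loosely modelled on a refcount-release macro, not the actual IFAFREE() definition. Without "do { } while (0)", the if/else inside the macro would capture the caller's "else" and the trailing semicolon would break the statement.

```c
#include <assert.h>

/* Hypothetical stand-in for struct ifaddr; only the fields used below. */
struct fake_ifaddr {
	int refcnt;	/* remaining references */
	int freed;	/* set to 1 once "freed" */
};

/* Hypothetical stand-in for ifafree(): just records that it ran. */
static void
fake_ifafree(struct fake_ifaddr *ifa)
{
	ifa->freed = 1;
}

/*
 * The do { } while (0) wrapper turns the whole body into exactly one
 * statement, so "FAKE_IFAFREE(x);" can sit in an if/else arm.
 */
#define FAKE_IFAFREE(ifa) do {				\
	if ((ifa)->refcnt <= 0)				\
		fake_ifafree(ifa);			\
	else						\
		(ifa)->refcnt--;			\
} while (0)

/* The caller's else binds to the caller's if, not to the macro's. */
static int
release_if(struct fake_ifaddr *ifa, int cond)
{
	assert(ifa != 0);
	if (cond)
		FAKE_IFAFREE(ifa);
	else
		return (-1);
	return (0);
}
```

With a bare if/else as the macro body, release_if() would not compile, because the semicolon after the macro call would end the statement before the caller's "else".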


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.192 19-Feb-2018 dlg

tunneldf needs ifr_df


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.
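
The alignment arithmetic behind this can be sketched as follows. It is only an illustration: the SKETCH_ names and helpers are made up for this example, and it assumes a 14-byte Ethernet header, a 2-byte pad and a cluster that starts on a 4-byte boundary.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_ETHER_HDR_LEN	14	/* dst MAC + src MAC + ethertype */
#define SKETCH_ETHER_ALIGN	2	/* pad placed in front of the header */

/* Offset of the IP header from the start of the cluster for a given pad. */
static size_t
ip_hdr_offset(size_t pad)
{
	return (pad + SKETCH_ETHER_HDR_LEN);
}

/* Nonzero if the IP header lands on a 4-byte boundary. */
static int
ip_hdr_is_aligned(size_t pad)
{
	return (ip_hdr_offset(pad) % 4 == 0);
}

/* Cluster bytes needed for an rx buffer of rxbuf bytes plus the pad. */
static size_t
cluster_bytes_needed(size_t rxbuf, size_t pad)
{
	assert(pad < 4);
	return (rxbuf + pad);
}
```

With no pad the IP header sits at offset 14, which is not a multiple of 4. With the 2-byte pad it sits at offset 16, and a 2048-byte rx buffer then needs 2050 bytes of cluster, which is what the 2k + 2 pool provides.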


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interfaces parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interfaces function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@
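
The get/put pairing described above can be sketched in userland as follows. Every name here is a hypothetical mock (sketch_ifnet, sketch_if_get, sketch_if_put), and the plain counter is single-threaded only; the kernel implementation is not shown in this log and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mock of an interface descriptor. */
struct sketch_ifnet {
	unsigned int index;
	unsigned int refcnt;
};

#define SKETCH_NIFS 4
static struct sketch_ifnet sketch_iftable[SKETCH_NIFS] = {
	{ 0, 0 }, { 1, 0 }, { 2, 0 }, { 3, 0 }
};

/* Look up by index and take a reference; NULL if there is no such index. */
static struct sketch_ifnet *
sketch_if_get(unsigned int index)
{
	struct sketch_ifnet *ifp;

	if (index >= SKETCH_NIFS)
		return (NULL);
	ifp = &sketch_iftable[index];
	ifp->refcnt++;
	return (ifp);
}

/* Drop a reference; a NULL pointer is accepted and ignored. */
static void
sketch_if_put(struct sketch_ifnet *ifp)
{
	if (ifp == NULL)
		return;
	assert(ifp->refcnt > 0);
	ifp->refcnt--;
}

/* Get and put in the same function, so the reference is short-lived. */
static int
sketch_lookup_index(unsigned int index)
{
	struct sketch_ifnet *ifp;
	int found;

	ifp = sketch_if_get(index);
	found = (ifp != NULL);
	/* ... brief use of ifp would go here ... */
	sketch_if_put(ifp);
	return (found);
}
```

Keeping both calls in one function makes an unbalanced reference easy to spot, for a reviewer as well as for a static analysis tool.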


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input() a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who is currently lebensmittelvergiftet.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@
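
A small sketch of the wraparound-safe comparison follows. It is a variant of the idiom, not the kernel code: the function names are invented here, and the subtraction is done in unsigned arithmetic and converted back to int, which assumes the usual two's-complement conversion.

```c
#include <assert.h>
#include <limits.h>

/*
 * Nonzero if at least `interval` ticks have passed between `last` and
 * `now`. Only the difference is compared, so a wrapped counter still
 * gives the right answer as long as the real gap is under INT_MAX.
 */
static int
ticks_elapsed(int now, int last, int interval)
{
	int diff;

	assert(interval >= 0);
	/* Unsigned subtraction wraps modulo 2^N instead of overflowing. */
	diff = (int)((unsigned int)now - (unsigned int)last);
	return (diff >= interval);
}

/* Naive version: compares absolute values and breaks across a wrap. */
static int
ticks_elapsed_naive(int now, int last, int interval)
{
	return ((long long)now >= (long long)last + interval);
}
```

When the counter wraps from INT_MAX to INT_MIN between the two samples, the naive version reports that no time has passed, while the difference-based version still sees the elapsed ticks.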


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With inputs from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf
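
As a rough sketch of the mask approach: the SK_ names and bit values below are illustrative assumptions for this example and should not be read as the definitions in if.h. A single mask covering every checksum capability means a child interface picks up new checksum flags without a hand-maintained list.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative capability bits; names and values are assumptions. */
#define SK_IFCAP_CSUM_IPv4	0x00000001
#define SK_IFCAP_CSUM_TCPv4	0x00000002
#define SK_IFCAP_CSUM_UDPv4	0x00000004
#define SK_IFCAP_VLAN_MTU	0x00000010	/* not a checksum capability */
#define SK_IFCAP_CSUM_TCPv6	0x00000080
#define SK_IFCAP_CSUM_UDPv6	0x00000100

/* One mask covering all checksum offload bits, IPv4 and IPv6 alike. */
#define SK_IFCAP_CSUM_MASK	(SK_IFCAP_CSUM_IPv4 | SK_IFCAP_CSUM_TCPv4 | \
    SK_IFCAP_CSUM_UDPv4 | SK_IFCAP_CSUM_TCPv6 | SK_IFCAP_CSUM_UDPv6)

/* Capabilities a child interface would inherit from its parent. */
static uint32_t
inherit_csum_caps(uint32_t parent_caps)
{
	/* Sanity check: the mask must not leak non-checksum bits. */
	assert((SK_IFCAP_CSUM_MASK & SK_IFCAP_VLAN_MTU) == 0);
	return (parent_caps & SK_IFCAP_CSUM_MASK);
}
```

Because the IPv6 bits are part of the mask, a parent advertising TCPv6 checksum offload passes it on, while unrelated bits such as the VLAN MTU flag are filtered out.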


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
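
A minimal sketch of such a predicate follows. The enum ordering and the SK_ names are assumptions made for the example; only the rule itself (unknown counts as up) comes from the commit message.

```c
#include <assert.h>

/* Hypothetical link states, ordered so every "up" variant is >= UP. */
enum sk_link_state {
	SK_LINK_STATE_UNKNOWN = 0,
	SK_LINK_STATE_DOWN,
	SK_LINK_STATE_UP,
	SK_LINK_STATE_HALF_DUPLEX,
	SK_LINK_STATE_FULL_DUPLEX
};

/* Unknown is treated as up; only an explicit down state counts as down. */
#define SK_LINK_STATE_IS_UP(s) \
	((s) >= SK_LINK_STATE_UP || (s) == SK_LINK_STATE_UNKNOWN)

/* Example caller: decide whether sending on the link is worth trying. */
static int
sk_can_send(int state)
{
	assert(state >= SK_LINK_STATE_UNKNOWN &&
	    state <= SK_LINK_STATE_FULL_DUPLEX);
	return (SK_LINK_STATE_IS_UP(state) ? 1 : 0);
}
```

Drivers that never report link state stay in the unknown state, so with this rule their interfaces are not treated as down by default.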


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows to cleanup the MPLS input and output paths and fix mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalives packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx rings size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce a if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows to specify less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interfaces start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now its queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows to specify a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
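
The reason for the do ... while wrapper is that a multi-statement macro
must expand to exactly one statement, or it misbehaves as the unbraced
body of an if/else. A minimal sketch of the pattern follows. The names
Q_ENQUEUE, struct queue, struct pkt and enqueue_if_room are hypothetical
and are not the real IF_ENQUEUE() or struct ifqueue definitions.

```c
#include <stddef.h>

/* Hypothetical miniature queue; not the real struct ifqueue. */
struct pkt { struct pkt *next; };
struct queue { struct pkt *head, *tail; int len; };

/* Multi-statement macro wrapped so it expands to exactly one statement. */
#define Q_ENQUEUE(q, p) do {				\
	(p)->next = NULL;				\
	if ((q)->tail == NULL)				\
		(q)->head = (p);			\
	else						\
		(q)->tail->next = (p);			\
	(q)->tail = (p);				\
	(q)->len++;					\
} while (0)

/* The macro is safe even as the unbraced body of an if/else. */
static int
enqueue_if_room(struct queue *q, struct pkt *p, int maxlen)
{
	if (q->len < maxlen)
		Q_ENQUEUE(q, p);
	else
		return (0);
	return (1);
}
```

With a bare { ... } block instead of do { ... } while (0), the semicolon
after Q_ENQUEUE(q, p) would end the if statement and the else would no
longer compile.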


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on a similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add a if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces who are member of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow to register callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange get updated on every packet transmission/receipt.
now: if_lastchange get updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we needed two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.191 10-Feb-2018 florian

Implement RFC 7217: "A Method for Generating Semantically Opaque
Interface Identifiers with IPv6 Stateless Address Autoconfiguration."

"An IPv6 address configured using this method is stable within each
subnet, but the corresponding Interface Identifier changes when the
host moves from one network to another. This method is meant to be an
alternative to generating Interface Identifiers based on hardware
addresses."

OK naddy, sthen


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interfaces parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interfaces function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@
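
The get/put pairing described above can be sketched in userland as
follows. Everything here is hypothetical (struct ifp_sketch, ifp_table,
sketch_get, sketch_put, sketch_can_detach) and is not the kernel
implementation; in particular the counter is a plain integer, whereas
real kernel code would need atomic operations and a sleep on detach.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct ifnet; only the fields the sketch needs. */
struct ifp_sketch {
	unsigned int	index;
	unsigned int	refcnt;
};

#define NIFP	4
static struct ifp_sketch *ifp_table[NIFP];	/* index -> interface map */

/* Look up an interface by index and take a reference; NULL if absent. */
static struct ifp_sketch *
sketch_get(unsigned int index)
{
	struct ifp_sketch *ifp;

	if (index >= NIFP || (ifp = ifp_table[index]) == NULL)
		return (NULL);
	ifp->refcnt++;
	return (ifp);
}

/* Release a reference taken by sketch_get(); tolerates NULL. */
static void
sketch_put(struct ifp_sketch *ifp)
{
	if (ifp != NULL)
		ifp->refcnt--;
}

/* Detach may only finish once every borrowed reference has been returned. */
static int
sketch_can_detach(const struct ifp_sketch *ifp)
{
	return (ifp->refcnt == 0);
}
```

Every successful sketch_get() must be matched by one sketch_put(),
ideally in the same function, so that references are only held briefly.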


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input() a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who is currently lebensmittelvergiftet.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@
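
The difference between a direct comparison and the subtract-then-compare idiom can be sketched as below. The function names are hypothetical and this is not the code from kern_timeout.c; it only illustrates why comparing a difference copes with a wrapping tick counter.

```c
#include <limits.h>

/* Naive check: breaks once the tick counter wraps past INT_MAX. */
int
sketch_elapsed_naive(int now, int then)
{
	return (now > then);
}

/*
 * Wrap-tolerant check: take the difference first, then look at its sign.
 * The subtraction is done on unsigned values so it is well defined; the
 * conversion back to int assumes the usual two's complement behaviour.
 */
int
sketch_has_elapsed(int now, int then)
{
	return ((int)((unsigned int)now - (unsigned int)then) > 0);
}
```

With then just below INT_MAX and now just past the wrap, the naive version reports that no time has passed, while the difference-based version still sees a small positive interval.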


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With inputs from and ok bluhm@, mikeb@
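
A lookup helper of this kind might look roughly like the sketch below. The table, its size and the struct layout are hypothetical; only the idea of replacing open-coded bounds and NULL checks with one function comes from the commit message.

```c
#include <stddef.h>

/* Hypothetical, minimal interface descriptor. */
struct sketch_ifnet {
	unsigned int	 index;
	const char	*name;
};

#define SKETCH_MAXIF 4	/* assumed table size, for illustration only */

/* Stand-in for the global array indexed by interface number. */
static struct sketch_ifnet *sketch_iftable[SKETCH_MAXIF];

/* Return the descriptor for index, or NULL if it is out of range or unused. */
struct sketch_ifnet *
sketch_if_get(unsigned int index)
{
	if (index >= SKETCH_MAXIF)
		return (NULL);
	return (sketch_iftable[index]);
}
```

Callers then only have to test the returned pointer instead of repeating the range check at every access.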


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf
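
As a rough illustration of why a single mask helps, consider the sketch below. The SKETCH_ names and bit values are made up for the example and are not the real IFCAP_* definitions.

```c
/* Hypothetical capability bits; the values are illustrative only. */
#define SKETCH_IFCAP_CSUM_IPv4	0x001
#define SKETCH_IFCAP_CSUM_TCPv4	0x002
#define SKETCH_IFCAP_CSUM_UDPv4	0x004
#define SKETCH_IFCAP_VLAN_MTU	0x010
#define SKETCH_IFCAP_CSUM_TCPv6	0x080
#define SKETCH_IFCAP_CSUM_UDPv6	0x100

/* One mask covering every checksum bit, IPv4 and IPv6 alike. */
#define SKETCH_IFCAP_CSUM_MASK					\
	(SKETCH_IFCAP_CSUM_IPv4 | SKETCH_IFCAP_CSUM_TCPv4 |	\
	 SKETCH_IFCAP_CSUM_UDPv4 | SKETCH_IFCAP_CSUM_TCPv6 |	\
	 SKETCH_IFCAP_CSUM_UDPv6)

/* Capabilities a child interface would inherit from its parent. */
unsigned int
sketch_inherit_csum(unsigned int parent_caps)
{
	return (parent_caps & SKETCH_IFCAP_CSUM_MASK);
}
```

Because the child takes whatever the mask covers, adding a new checksum bit to the mask is enough for it to be inherited; no per-flag list has to be kept in sync.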


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
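
The changed predicate could be sketched along these lines. The enum values and names below are hypothetical and do not reproduce the real LINK_STATE_* constants; the point is only that an unknown state is now treated as usable.

```c
/* Hypothetical link states; the numbering is an assumption. */
enum sketch_link_state {
	SKETCH_LINK_UNKNOWN = 0,	/* driver cannot tell */
	SKETCH_LINK_DOWN = 1,
	SKETCH_LINK_UP = 2
};

/* Up means "known up" or "unknown"; only a known-down link is excluded. */
#define SKETCH_LINK_STATE_IS_UP(s)				\
	((s) >= SKETCH_LINK_UP || (s) == SKETCH_LINK_UNKNOWN)

int
sketch_link_usable(int state)
{
	return (SKETCH_LINK_STATE_IS_UP(state));
}
```

Callers that used to test for "up or unknown" by hand can then rely on the macro alone.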


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.
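
The two rules above (grow at most once per tick, shrink by an eighth) can be sketched as follows. The structure and field names are hypothetical and the growth step of one slot is an assumption; this is not the mclgeti code itself.

```c
/* Hypothetical per-ring accounting; field names are illustrative. */
struct sketch_ring {
	unsigned int	cwm;		/* current watermark */
	unsigned int	lwm;		/* low watermark, never go below */
	unsigned int	hwm;		/* high watermark, never go above */
	int		grown_tick;	/* tick at which the ring last grew */
};

/* Allow the ring to grow by one slot, but only once per tick. */
void
sketch_ring_grow(struct sketch_ring *r, int now)
{
	if (now == r->grown_tick)
		return;
	if (r->cwm < r->hwm)
		r->cwm++;
	r->grown_tick = now;
}

/* Punish the ring by an eighth of its current size, clamped to lwm. */
void
sketch_ring_punish(struct sketch_ring *r)
{
	unsigned int cwm = r->cwm - r->cwm / 8;

	r->cwm = (cwm < r->lwm) ? r->lwm : cwm;
}
```

Shrinking by an eighth instead of a half keeps the ring from collapsing on a single livelock event, which matches the stated goal of not holding the rings low for too long.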


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows to cleanup the MPLS input and output paths and fix mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladdr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalive packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows binding interfaces to
alternate routing tables and separating them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered, still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx rings size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce an if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows to specify less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@
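
The arithmetic described above can be sketched like this. The constant values and the "no priority given" sentinel are assumptions made for the example, as is the convention that a numerically lower priority is preferred.

```c
/* Hypothetical constants; the values are assumptions for illustration. */
#define SKETCH_RTP_NONE		0	/* caller gave no explicit priority */
#define SKETCH_RTP_STATIC	8	/* base priority for static routes */

/*
 * Pick the priority for a new route: keep an explicit request, otherwise
 * start from the static base and add the interface's own priority.
 */
unsigned int
sketch_route_priority(unsigned int requested, unsigned int if_priority)
{
	if (requested != SKETCH_RTP_NONE)
		return (requested);
	return (SKETCH_RTP_STATIC + if_priority);
}
```

Under these assumptions an interface with if_priority 0 yields 8 and one with if_priority 4 yields 12, so routes over the second interface only win once the first one is gone.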


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interfaces start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now its queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows to specify a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning
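
The reason for the do ... while wrapping can be shown with a small sketch. The queue structure and macro below are hypothetical and are not the real IF_ENQUEUE(); they only demonstrate the idiom.

```c
/* Hypothetical queue counters, just enough to show the idiom. */
struct sketch_queue {
	int	len;
	int	drops;
};

/*
 * Multi-statement macro wrapped in do { } while (0) so that it expands
 * to exactly one statement and composes safely with if/else.
 */
#define SKETCH_DROP(q) do {					\
	(q)->drops++;						\
	(q)->len--;						\
} while (0)

int
sketch_drop_if(struct sketch_queue *q, int cond)
{
	if (cond)
		SKETCH_DROP(q);	/* the trailing semicolon ends the statement */
	else
		q->len++;
	return (q->len);
}
```

Without the wrapper, a bare brace block followed by a semicolon would detach the else branch, and an unbraced pair of statements would leave the second one outside the if.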


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add a if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow to register callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange get updated on every packet transmission/receipt.
now: if_lastchange get updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we need two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.190 16-Jan-2018 mpi

Recycle IFF_NOTRAILERS into IFF_STATICARP and document ownership
of IFF* flags.

inputs from jmc@, ok bluhm@, visa@


# 1.189 21-Dec-2017 dlg

prototype if_attach_iqueues so drivers can configure multiple iqs.


# 1.188 09-Nov-2017 tb

The cmd argument of ifconf() has been unused since COMPAT_LINUX was
purged. Remove it and move the prototype to if.c since ifconf() is
not used outside of this file.

ok mpi


# 1.187 31-Oct-2017 sashan

- add one more softnet taskq
NOTE: code still runs with single softnet task. change definition of
SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@


Revision tags: OPENBSD_6_1_BASE OPENBSD_6_2_BASE
# 1.186 24-Jan-2017 krw

A space here, a space there. Soon we're talking real whitespace
rectification.


# 1.185 24-Jan-2017 dlg

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isnt set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).


# 1.184 23-Jan-2017 mpi

Flag pseudo-interfaces as such in order to call add_net_randomness()
only once per packet.

Fix a regression introduced when if_input() started to be called by
every pseudo-driver.

ok claudio@, dlg@


# 1.183 23-Jan-2017 dlg

i botched the copyout to ifr->ifr_data in SIOCGIFDATA.

this lets pflogd run again.

rename if_data() to if_getdata() while here to make grepping for
things less noisy.

reported by jsg@
worked through with deraadt@


# 1.182 23-Jan-2017 dlg

merge the ifnet and ifqueue stats together when userland wants them.

a new if_data() function takes a pointer to ifnet and merges its
if_data and ifq statistics. it takes the ifq mutex around the reads
of the ifq stats so they get a consistent copy.

the ifnet and ifq stats are merged because some parts of the stack
still update the ifnet counters.

ok visa@ (on an earlier diff) mpi@ claudio@


# 1.181 12-Dec-2016 mpi

Remove most of the splsoftnet() recursions related to cloned interfaces.

inputs and ok bluhm@


# 1.180 27-Oct-2016 dlg

add a new pool for 2k + 2 byte (mcl2k2) clusters.

a certain vendor likes to make chips that specify the rx buffer
sizes in kilobyte increments. unfortunately it places the ethernet
header on the start of the rx buffer, which means if you give it a
mcl2k cluster, the ethernet header will not be ETHER_ALIGNed cos
mcl2k clusters are always allocated on 2k boundaries (cos they pack
into pages well). that in turn means the ip header wont be aligned
correctly.

the current workaround on these chips has been to let non-strict
alignment archs just use the normal 2k cluster, but use whatever
cluster can fit 2k + 2 on strict archs. that turns out to be the
4k cluster, meaning we waste nearly 2k of space on every packet.

properly aligning the ethernet header and ip headers gives a
performance boost, even on non-strict archs.


# 1.179 04-Sep-2016 reyk

Move code to change the rdomain of an interface from the ioctl switch case
to a new function if_setrdomain().

OK mpi@ henning@


# 1.178 03-Sep-2016 reyk

Add support for a multipoint-to-multipoint mode in vxlan(4). In this
mode, vxlan(4) must be configured to accept any virtual network
identifier with "vnetid any" and added to a bridge(4) or switch(4).
This way the driver will dynamically learn the tunnel endpoints and
their vnetids for the responses and can be used to dynamically bridge
between VXLANs. It is also being used in combination with switch(4)
and the OpenFlow tunnel classifiers.

With input from yasuoka@ goda@
OK deraadt@ dlg@


Revision tags: OPENBSD_6_0_BASE
# 1.177 10-Jun-2016 vgross

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@


# 1.176 02-Mar-2016 dlg

provide generic ioctls for managing an interfaces parent

in the future this will subsume the individual vlandev, carpdev,
pppoedev, foodev options for things like vlan, carp, pppoe, etc.

inspired by vnetid

ok mpi@ jmatthew@


Revision tags: OPENBSD_5_9_BASE
# 1.175 05-Dec-2015 deraadt

avoid an ugly wrap in a comment


# 1.174 03-Dec-2015 dlg

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interfaces function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines dont have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@


# 1.173 20-Nov-2015 mpi

Keep if_ref() private, if_get() is what you want to use before if_put().

The thread detaching an interface will sleep until all references to this
interface have been released. So we decided to only keep references for
a short period of time.

Keeping if_ref() private will hopefully help preserve this goal as long
as it makes sense.

Calling if_get()/if_put() in the same function also allows us to make
use of static analysis tools (thanks jsg@!) to catch our errors.

ok dlg@
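
The get/put pairing described above can be sketched as follows. This is a toy model under stated assumptions, not the kernel implementation: toy_ifnet, toy_ifs, toy_if_get, toy_if_put and toy_if_can_detach are hypothetical names, and the real code also has to deal with locking and sleeping, which is left out here.

```c
#include <stddef.h>

#define TOY_NIFS 4

/* Hypothetical, simplified interface descriptor. */
struct toy_ifnet {
	unsigned int refcnt;	/* outstanding references */
	int attached;		/* nonzero while the interface exists */
};

static struct toy_ifnet toy_ifs[TOY_NIFS];

/* Look up an interface by index and take a reference on it. */
static struct toy_ifnet *
toy_if_get(unsigned int index)
{
	struct toy_ifnet *ifp;

	if (index >= TOY_NIFS)
		return NULL;
	ifp = &toy_ifs[index];
	if (!ifp->attached)
		return NULL;
	ifp->refcnt++;
	return ifp;
}

/* Release a reference previously taken with toy_if_get(). */
static void
toy_if_put(struct toy_ifnet *ifp)
{
	if (ifp == NULL)
		return;
	ifp->refcnt--;
}

/* A detaching thread may only proceed once all references are gone. */
static int
toy_if_can_detach(const struct toy_ifnet *ifp)
{
	return ifp->refcnt == 0;
}
```

Keeping the get and the put in the same function keeps each reference short-lived and makes a missing put easy to spot, which is the property the commit message says static analysis can check.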


# 1.172 24-Oct-2015 reyk

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system. This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@


# 1.171 23-Oct-2015 claudio

Introduce a new sysctl NET_RT_IFNAMES that returns only ifnames to ifindex
mappings. This will be used by if_nameindex(3), if_nametoindex(3) and
if_indextoname(3) soon to fix the issues in pledge because of inet6 link
local addressing.
OK mpi@ benno@ deraadt@
The libc version will follow soon so better start updating your kernels


# 1.170 23-Oct-2015 dlg

tweak the vnetid so it can be optional and therefore cleared/deleted.

the abstract vnetid is promoted to a uint32_t, and adds a SIOCDVNETID
ioctl so it can be cleared.

this is all because i set an assignment on implementing a virtual
network interface and the students got confused when vnetid 0 didnt
show up in ifconfig output.

the vnetid in the vxlan(4) protocol is optional, but the current
code confuses 0 with no vnetid being set. this makes it clear.

ok reyk@ who also simplified my diff


# 1.169 05-Oct-2015 uebayasi

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@


# 1.168 27-Sep-2015 stsp

Add if_setlladdr(), factored out from ifioctl(). Will be used by iwm(4) soon.
With suggestions from tedu@ and guenther@
ok kettenis@


# 1.167 11-Sep-2015 stsp

Make room for media types of the future. Extend the ifmedia word to 64 bits.
This changes numbers of the SIOCSIFMEDIA and SIOCGIFMEDIA ioctls and
grows struct ifmediareq.

Old ifconfig and dhclient binaries can still assign addresses, however
the 'media' subcommand stops working. Recompiling ifconfig and dhclient
with new headers before a reboot should not be necessary unless in very
special circumstances where non-default media settings must be used to
get link and console access is not available.

There may be some MD fallout but that will be cleared up later.

ok deraadt miod
with help and suggestions from several sharks attending l2k15


# 1.166 09-Sep-2015 dlg

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@


# 1.165 30-Aug-2015 mpi

Use a global table for domains instead of building a list at run time.

As a side effect there's no need to run if_attachdomain() after the
list of domains has been built.

ok claudio@, reyk@


Revision tags: OPENBSD_5_8_BASE
# 1.164 07-Jun-2015 jsg

Introduce unhandled_af() for cases where code conditionally does
something based on an address family and later assumes one of the paths
was taken. This was initially just calls to panic until guenther
suggested a function to reduce the amount of strings needed.

This reduces the amount of noise with static analysers and acts
as a sanity check.

ok guenther@ bluhm@


# 1.163 18-May-2015 reyk

Move the rdomain from struct ifnet into struct if_data. This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data. All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl. In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@


# 1.162 10-Apr-2015 mpi

Run detach hook and similar before cleaning up any other resource when
an interface is destroyed/removed. This way we can ensure pseudo-driver
changes done after attaching an interface are undone before detaching it.

Note: it is safe to call if_deactivate() multiple times as the interface
should not have any attached pseudo-interface after the first call.

ok deraadt@, dlg@


# 1.161 18-Mar-2015 dlg

remove the congestion handling from struct ifqueue.

its only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@


Revision tags: OPENBSD_5_7_BASE
# 1.160 08-Feb-2015 mpi

Introduce if_input() a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@


# 1.159 06-Jan-2015 stsp

Remove the NOINET6 interface flag, a left-over from the times when IPv6
was enabled by default. Add AFATTACH/AFDETACH ioctls which enable/disable
an address family for an interface (currently used for IPv6 only).

New kernel needs new ifconfig for IPv6 configuration (address assignment
still works with old ifconfig making this easy to cross over).

Committing on behalf of henning@ who currently has food poisoning.
ok stsp, benno, mpi


# 1.158 05-Dec-2014 mpi

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@


Revision tags: OPENBSD_5_6_BASE
# 1.157 14-Jul-2014 dlg

now that receive ring accounting has been pulled out of the mbuf layer,
we can pull the space the mbuf layer used to do per interface accounting
out of struct if_data.

saves a hundredish bytes on every interface.

ok deraadt@ claudio@


# 1.156 11-Jul-2014 henning

introduce the IFXF_AUTOCONF6 interface flag which controls whether we
accept rtadvs on that interface. the global net.inet6.ip6.accept_rtadv
sysctl just doesn't cut it, even tho the spec wants that - but in their
little absurd world, a host just has one interface by definition anyway...
the sysctl goes away.
lots of head scratching, brain cell elimination etc from bluhm benno stsp
florian, excitement from simon and todd, ok bluhm stsp benno florian


# 1.155 08-Jul-2014 dlg

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward


# 1.154 13-Jun-2014 mpi

Instead of updating all the cluster allocation water marks of all the
interfaces when the kernel is livelocked, only do it for the current
pool and defer the other updates.

This allows us to get rid of an interface list iteration in a critical
path.

Riding the libc crank since this change introduces an ABI break.

ok claudio@


Revision tags: OPENBSD_5_5_BASE
# 1.153 21-Nov-2013 mikeb

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi


# 1.152 09-Nov-2013 dlg

ticks is compared against mcl_grown to see if time has elapsed since
the rx ring was last allowed to grow and then assigned to it. it
is erroneous to do this because mcl_grown is a u_int and ticks is an
int.

this makes mcl_grown an int, and follows the idiom in kern_timeout.c
of going "thing - ticks < diff", which better copes with ticks
wrapping around and being used to calculate relative intervals.

ok pirofti@ guenther@


# 1.151 01-Nov-2013 pelikan

keep net/hfsc.h away from userspace, except in pfctl

tested by naddy, ok deraadt


# 1.150 21-Oct-2013 benno

nuke comment. How soon is now?
"do it" deraadt@


# 1.149 19-Oct-2013 reyk

Bring back the if_detachhook. We're going to have more users now.

ok mpi@ henning@ benno@


# 1.148 13-Oct-2013 reyk

Import vxlan(4), the virtual extensible local area network tunnel
interface. VXLAN is a UDP-based tunnelling protocol for overlaying
virtualized layer 2 networks over layer 3 networks. The implementation
is based on draft-mahalingam-dutt-dcops-vxlan-04 and has been tested
with other implementations in the wild.

put it in deraadt@


# 1.147 12-Oct-2013 henning

new bandwidth shaping subsystem, kernel side
uses hfsc behind the scenes; altq stays in parallel for a migration phase.
if.h even more messy for the transition, but eventually it should become
readable...
looked over & tested by many, ok phessler sthen


# 1.146 17-Sep-2013 mpi

Change vlan(4) detach procedure to not use a hook but a list of vlans
on the parent interface. This is similar to what bridge(4), trunk(4)
or carp(4) are doing and allows us to get rid of the detachhook.

ok reyk@, mikeb@


# 1.145 28-Aug-2013 mpi

Remove unused argument from *rtrequest()

ok krw@, mikeb@


Revision tags: OPENBSD_5_4_BASE
# 1.144 20-Jun-2013 mpi

Revert previous and unbreak asr, the new include should be protected.

Reported by naddy@


# 1.143 20-Jun-2013 mpi

Allocate the various hook head descriptors as part of the ifnet
structure rather than doing various M_WAITOK allocations during
the *attach() functions, we always rely on them anyway.

ok mikeb@, uebayasi@


# 1.142 02-Apr-2013 mpi

Instead of storing the link-level address of every interface in a global
array indexed by interface numbers, add a new field to the interface
descriptor pointing to it.

claudio@ and todd@ like it, ok mikeb@


# 1.141 26-Mar-2013 mpi

Remove various read-only *maxlen variables and use IFQ_MAXLEN directly.

ok beck@, mikeb@


# 1.140 20-Mar-2013 mpi

Introduce if_get() to retrieve an interface descriptor pointer given
an interface index and replace all the redundant checks and accesses
to a global array by a call to this function.

With inputs from and ok bluhm@, mikeb@


# 1.139 07-Mar-2013 mpi

Remove unused ifa_ifwithaf() function.

ok mikeb@, miod@


# 1.138 07-Mar-2013 mpi

Remove the IFAFREE() macro, the ifafree() function it was calling already
checks the reference counter.

ok mikeb@, miod@, pelikan@, kettenis@, krw@


Revision tags: OPENBSD_5_3_BASE
# 1.137 23-Nov-2012 sthen

Add SIOCGIFHARDMTU to allow retrieving the driver's maximum supported MTU
looks fine reyk@ ok mikeb@


# 1.136 11-Nov-2012 deraadt

align ifaliasreq.ifra_addr similar to the way that ifreq is fixed --
a gruesome union, to block the compiler from placing the struct
incorrectly aligned on stack frames
ok guenther


# 1.135 05-Oct-2012 camield

Point an interface directly to its bridgeport configuration, instead
of to the bridge itself. This is ok, since an interface can only be part
of one bridge, and the parent bridge is easy to find from the bridgeport.

This way we can get rid of a lot of list walks, improving performance
and shortening the code.

ok henning stsp sthen reyk


# 1.134 19-Sep-2012 henning

define an IFCAP_CSUM_MASK, covering IFCAP_CSUM_*, and use it in if_vlan.c
to replace the list of them.
this actually makes vlan inherit the IPv6 CSUM flags from its parent, that
had been commented out since this code was committed back in 2001.
ok benno mpf


# 1.133 10-Sep-2012 guenther

Bring into compliance with POSIX, exposing just the specified bits.

Requested by jasper@, ok kettenis@


# 1.132 21-Aug-2012 bluhm

Reverse the name and meaning of the IFXF_INET6_PRIVACY interface
flag. It is now called IFXF_INET6_NOPRIVACY. So IPv6 privacy
addresses are on by default without resetting the flag during
ifconfig down/up.
OK stsp@, sperreault@ (who wrote the same diff)


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.131 02-Dec-2011 haesbaert

Kill unused IFCAP_IPSEC and IFCAP_IPCOMP.

ok claudio@ henning@ mikeb@


# 1.130 02-Nov-2011 haesbaert

Expose if_capabilities to userland so that ifconfig can display the
device hardware features.
Tune ifconfig to show them with 'hwfeatures' argument.
While here, kill some old unused capabilities and respect 80 columns
in brconfig.h.

ok mcbride@, henning@, mpf@.


# 1.129 07-Oct-2011 henning

rename some vars and functions
unfortunately altq is one giant namespace violation. rename just those that
conflict with new stuff for now only to be found on my laptop. reduce pain,
the diff is huge already. ok ryan


Revision tags: OPENBSD_5_0_BASE
# 1.128 08-Jul-2011 henning

new priority queueing implementation, extremely low overhead, thus fast.
unconditional, always on. 8 priority levels, as every better switch, the
vlan header etc etc. ok ryan mpf sthen, pea tested as well


# 1.127 07-Jul-2011 henning

provide IF_LEN and IFQ_LEN to access ifq_len on an ifqueue, ryan ok


# 1.126 05-Jul-2011 henning

now of course I only noticed if_qflush is completely unused after
adjusting it to the new world order in my tree... remove it, ok ryan claudio


# 1.125 03-Jul-2011 henning

IFQ_CLASSIFY is also just shrapnel


# 1.124 03-Jul-2011 henning

no traces of ALTQ_DECL to be found anywhere, thus kill the #defines


# 1.123 03-Jul-2011 claudio

LINK_STATE_IS_UP() should consider LINK_STATE_UNKNOWN as an up state.
This is now possible because carp no longer uses LINK_STATE_UNKNOWN
for a state that is considered down. This will simplify a lot of code.
OK mpf@ mcbride@ henning@
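
A minimal sketch of the predicate described above, assuming a simplified set of link states. The toy_link_state enum and its values are hypothetical and are not the constants from if.h; only the rule "unknown counts as up" is taken from the commit message.

```c
/* Hypothetical link states, ordered so that "up" states sort last. */
enum toy_link_state {
	TOY_LINK_STATE_UNKNOWN = 0,	/* driver cannot tell */
	TOY_LINK_STATE_DOWN,
	TOY_LINK_STATE_UP,
	TOY_LINK_STATE_HALF_DUPLEX,	/* up, half duplex */
	TOY_LINK_STATE_FULL_DUPLEX	/* up, full duplex */
};

/* Treat an unknown link state as up, alongside the explicit up states. */
#define TOY_LINK_STATE_IS_UP(s) \
	((s) >= TOY_LINK_STATE_UP || (s) == TOY_LINK_STATE_UNKNOWN)

static int
toy_link_is_up(enum toy_link_state s)
{
	return TOY_LINK_STATE_IS_UP(s);
}
```

With this rule, callers no longer need a separate check for interfaces whose drivers never report a link state.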


# 1.122 13-Mar-2011 stsp

Add a way to enable/disable Wake On LAN with ifconfig.
ok deraadt


Revision tags: OPENBSD_4_9_BASE
# 1.121 17-Nov-2010 henning

introduce ifa_update_broadaddr to update an ifaddr's broadcast address,
trivial for the moment, more needed soon
tested by many as part of a larger diff, ok sthen claudio dlg krw


# 1.120 24-Sep-2010 claudio

Implement if_freenameindex() as a real function as required by posix.
OK deraadt@, millert@


# 1.119 23-Sep-2010 dlg

tweak the mclgeti algorithm to behave better under load.

instead of letting hardware rings grow on every interrupt, restrict
it so it can only grow once per softclock tick. we can only punish
the rings on softclock ticks, so it makes sense to only grow on
softclock tick boundaries too.

the rings are now punished after >1 lost softclock tick rather than
>2. mclgeti is now more aggressive at detecting livelock.

the rings get punished by an 8th, rather than by half.

we now allow the rings to be punished again even if the system is
already considered in livelock.

without this diff a livelocked system will have its rx ring sizes
scale up and down very rapidly, while holding the rings low for too
long. this affected throughput significantly.

discussed and tested heavily at j2k10. there are still some games
with softnet we can play, but this is a good first step.

"put it in" and ok deraadt@
ok claudio@ krw@ henning@ mcbride@

if we find out that it sucks we can pull it out again later. till then
we'll run with it and see how it goes.


# 1.118 27-Aug-2010 jsg

remove the unused if_init callback in struct ifnet
ok deraadt@ henning@ claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.117 26-Jun-2010 claudio

Implement a simple keepalive mechanism in gre(4) that is compatible with
the one used by Cisco. It sends a return gre packet inside a gre packet
to the other side and expects it to return.
OK deraadt, reyk additional testing by sthen


# 1.116 28-May-2010 claudio

Rework the way we handle MPLS in the kernel. Instead of fumbling MPLS into
ether_output() and later on other L2 output functions use a trick and over-
load the ifp->if_output() function pointer on MPLS enabled interfaces to
go through mpls_output() which will then call the link level output function.
By setting IFXF_MPLS on an interface the output pointers are switched.
This now allows cleaning up the MPLS input and output paths and fixing mpe(4)
so that the MPLS code now actually works for both P and PE systems.
Tested by myself and michele
(A custom kernel with MPLS and mpe enabled is still needed).


# 1.115 17-Apr-2010 deraadt

split SIOCSIFLLADDR code out into an ifnewlladdr() function
ok stsp


# 1.114 06-Apr-2010 stsp

Simple implementation of RFC4941, "Privacy Extensions for Stateless
Address Autoconfiguration in IPv6". For those among us who are paranoid
about broadcasting their MAC address to the IPv6 internet.

Man page help from jmc, testing by weerd, arc4random API hints from djm.

ok deraadt, claudio


Revision tags: OPENBSD_4_7_BASE
# 1.113 13-Jan-2010 henning

maintain a global RB tree of all local addresses in the system. this
includes AF_LINK addresses (aka mac addresses in the ethernet case). for
inet this also includes the broadcast addresses.
depends on ifinit() called earlier so we have a chance to pool_init before
autoconf assigns the AF_LINK addresses, the v6 fix, and the ifa_add/del
abstraction i just committed.
this is a change in semantics, it is now illegal to change the actual
address in an ifaddr struct because then the RB tree becomes unbalanced.
nothing using this tree yet.
ok theo ryan dlg


# 1.112 13-Jan-2010 henning

instead of fiddling with the per-interface address lists directly in
many places create a proper API (ifa_add / ifa_del) and use it.
ok theo ryan dlg


# 1.111 12-Jan-2010 deraadt

Make the structures for ifa_msghdr and friends even more like
the route messages so that people and compilers will not get
confused.
ok claudio


# 1.110 17-Sep-2009 claudio

Remove the compatibility structures for routing socket version 3.
The RTM_VERSION bump was 2 years ago and so there is no need for this.
Diff made by tedu@ some time ago but never got committed so I do it now.


# 1.109 14-Sep-2009 claudio

Add a way to convert the ifi_link_state to a string without the use of
if_media. This makes link state tracking a lot easier as there is no need
to convert if types to if_media types, etc. Additionally this allows us
to extend the link states to include states tracked on higher protocol layers.
gre(4) keepalive packets, bfd and udld can be implemented without ugly hacks.
OK henning, michele, sthen, deraadt


# 1.108 10-Aug-2009 deraadt

At sys_reboot time, bring all the interfaces down so that their xxstop
functions are called, which will turn off DMA. Receiving packets into
your memory after a system reboot is pretty nasty. This will also mean
that the shutdown hooks can go; this solution is smaller.
ok henning miod dlg kettenis


Revision tags: OPENBSD_4_6_BASE
# 1.107 06-Jun-2009 rainer

when xflags got changed, tell the userland by routing sockets

ok henning@


# 1.106 05-Jun-2009 claudio

Initial support for routing domains. This allows to bind interfaces to
alternate routing table and separate them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered still missing pf and IPv6.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@


# 1.105 04-Jun-2009 henning

allow IPvShit to be turned off completely per-interface.
ifconfig em0 -inet6
deletes all v6 addresses including link-local and prevents new ones from
being added.
ifconfig em0 inet6 <addr>
re-enables v6, brings the link local back and adds optional <addr>
ok theo reyk


# 1.104 03-Jun-2009 beck

make wireless interfaces priority 4 by default. other interfaces remain
priority 0. while we are in here make sure we add wi interfaces to group "wlan"
in the same way the net80211 stuff already is.

this makes dhcp multiple default routes useful on laptops.

ok claudio@


Revision tags: OPENBSD_4_5_BASE
# 1.103 27-Jan-2009 dlg

make drivers tell the mclgeti allocator what their maximum ring size is
to prevent the hwm growing beyond that. this allows the livelock mitigation
to do something where the hwm used to grow beyond twice the rx rings size.

ok kettenis@ claudio@


# 1.102 12-Dec-2008 claudio

Introduce an if_priority that will be added to RTP_STATIC when routes are
added without an explicit priority. This allows specifying less preferred
interfaces that will only take over if the primary interface loses link.
OK deraadt@


# 1.101 11-Dec-2008 deraadt

export per-interface mbuf cluster pool use statistics out to userland
inside if_data, so that netstat(1) and systat(1) can see them
ok dlg


# 1.100 30-Nov-2008 brad

- Remove unused if_reset "bus reset routine" field in the ifnet struct.
- Add if_stop "stop routine" field in the ifnet struct.

ok mglocker@


# 1.99 26-Nov-2008 dlg

provide m_clsetlwm, an interface for an interface to raise its low
watermark for mbuf cluster allocations.

this is necessary for things like bge which cannot cope with less than a
certain number of pkts on the ring.

ok deraadt@


# 1.98 25-Nov-2008 deraadt

Factor increases are not needed, +1 appears to work as well.
ok dlg


# 1.97 24-Nov-2008 deraadt

move MCLPOOLS to if.h and force uipc_mbuf.c to get if.h, there is no
other option
ok dlg


# 1.96 24-Nov-2008 dlg

add several backend pools to allocate mbufs clusters of various sizes out
of. currently limited to MCLBYTES (2048 bytes) and 4096 bytes until pools
can allocate objects of sizes greater than PAGESIZE.

this allows drivers to ask for "jumbo" packets to fill rx rings with.

the second half of this change is per interface mbuf cluster allocator
statistics. drivers can use the new interface (MCLGETI), which will use
these stats to selectively fail allocations based on demand for mbufs. if
the driver isnt rapidly consuming rx mbufs, we dont allow it to allocate
many to put on its rx ring.

drivers require modifications to take advantage of both the new allocation
semantic and large clusters.

this was written and developed with deraadt@ over the last two days
ok deraadt@ claudio@


# 1.95 07-Nov-2008 deraadt

give this some /* CONSTCOND */ love


Revision tags: OPENBSD_4_4_BASE
# 1.94 10-Apr-2008 dlg

introduce mitigation for the calling of an interfaces start routine.

decent drivers prefer to have a lot of packets on the send queue so they
can queue a lot of them up on the tx ring and then post them all in one
big chunk. unfortunately our stack queues one packet onto the send queue
and then calls the start handler immediately.

this mitigates against that queue, send, queue, send behaviour by trying to
call the start routine only once per softnet. now its queue, queue, queue,
send.

this is the result of a lot of discussion with claudio@
tested by many.


Revision tags: OPENBSD_4_3_BASE
# 1.93 18-Nov-2007 mpf

Sync struct ifaltq to match struct ifqueue.
I wonder why 64-bit archs have not been bitten by this.
OK mcbride@, henning@


# 1.92 03-Sep-2007 claudio

Bump RTM_VERSION to 4 and start a new era of routing in OpenBSD :)
Changes include 64bit counters instead of u_long, routing table id in the header
of most messages, reserved routing priority field, added a hdrlen field to skip
over the header so that binary compatibility becomes easier.
A minimal backward support for old binaries is included to ease upgrades but
don't expect anything more than ifconfig, route and dhclient to correctly work.
OK henning@ mglocker@


Revision tags: OPENBSD_4_2_BASE
# 1.91 25-Jun-2007 henning

crank ifq_maxlen from 50 to 256, so it is not smaller than most interfaces
rx rings any more. forwarding boxes with many fast interfaces can still use
some more, but this is a saner default.
ok deraadt markus henric


# 1.90 14-Jun-2007 reyk

Add a new "rtlabel" option to ifconfig. It allows to specify a route label
which will be used for new interface routes. For example,
ifconfig em0 10.1.1.0 255.255.255.0 rtlabel RING_1
will set the new interface address and attach the route label RING_1 to
the corresponding route.

manpage bits from jmc@
ok claudio@ henning@


# 1.89 29-May-2007 uwe

Define IF_ENQUEUE() and friends as proper C statements using do ... while
ok henning


# 1.88 26-May-2007 jason

one extern seems to be better than 20 for ifqmaxlen; ok krw


# 1.87 27-Mar-2007 jmc

grammar from bret lambert, and one more from me;


Revision tags: OPENBSD_4_1_BASE
# 1.86 09-Feb-2007 jmc

grammar fix from bret lambert;


# 1.85 28-Nov-2006 reyk

add additional link states to report the half duplex / full duplex
state, if known by the driver. this is required to check the full
duplex state without depending on the ifmedia ioctl which can't be
called in the kernel without process context.

ok henning@, brad@


# 1.84 16-Nov-2006 henning

introduce if_creategroup() to create an empty interface group.
code factored out from if_addgroup(), previously a group always had to have
members. ok mpf mcbride


# 1.83 31-Oct-2006 jason

ether_input_mbuf() isn't necessary, turn it into a macro and deal with
its "special" case in ether_input(). Based on similar idea in FreeBSD.
ok brad


Revision tags: OPENBSD_4_0_BASE
# 1.82 02-Jun-2006 mpf

Introduce attributes to interface groups.
As a first user, move the global carp(4) demotion counter
into the interface group. Thus we have the possibility
to define which carp interfaces are demoted together.

Put the demotion counter into the reserved field of the carp header.
With this, we can have carp act smarter if multiple errors occur.
It now always takes over other carp peers, that are advertising
with a higher demote count. As a side effect, we can also have
group failovers without the need of running in preempt mode.
The protocol change does not break compatibility with older
implementations.

Collaborative work with mcbride@

OK mcbride@, henning@


# 1.81 27-May-2006 brad

remove IFCAP_JUMBO_MTU interface capabilities flag and set if_hardmtu in a few
more drivers.

ok reyk@


# 1.80 26-May-2006 deraadt

rename jumbo mtu to if_hardmtu; ok brad reyk


# 1.79 19-May-2006 reyk

add a if_jumbo_mtu field to the interface structure for drivers
supporting ethernet jumbo frames. there's no standard for the size of
jumbo MTUs, so either let the driver set its own value or use 9000
byte jumbo frames by default.

ok brad@


# 1.78 04-Mar-2006 brad

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@


Revision tags: OPENBSD_3_9_BASE
# 1.77 09-Feb-2006 reyk

add an interface detach hook and use it with the vlan(4) driver. this
fixes a possible crash if the parent interface has been destroyed
(like vlan on trunk) before destroying the vlan interface.

ok brad@


Revision tags: OPENBSD_3_8_BASE
# 1.76 14-Jun-2005 henning

rename function and define to reflect the external -> egress name change
so it is clear what it is all about


# 1.75 14-Jun-2005 henning

use "egress" instead of "external" for the interface group containing the
interfaces the default route(s) point to, proposed by deraadt some days ago,
ok djm deraadt


# 1.74 12-Jun-2005 henning

add SIOCGIFGMEMB ioctl, returns a list of all interfaces that are members of
the given group, markus ok


# 1.73 07-Jun-2005 henning

introduce a default "external" interface group, containing the interface(s)
the default route(s) point to.
handles IPv4 and IPv6 as well as multipath routes.
follows default route changes, of course.
eases writing pf rulesets especially on laptops etc. that use different
interfaces depending on the environment (wired, wireless, ...)
ok theo ryan


# 1.72 06-Jun-2005 henning

use a define instead of hardcoding "all" in 3 places


# 1.71 05-Jun-2005 henning

const'ify the char *groupname param to if_addgroup and if_delgroup


# 1.70 24-May-2005 markus

add net.inet.ip.ifq for monitoring and changing ifqueue; similar to netbsd
ok henning


# 1.69 24-May-2005 reyk

initial import of a trunking (link aggregation and link failover)
implementation. it currently supports round robin mode with link state
checking, additional modes will be added later.

ok brad@, deraadt@


# 1.68 24-May-2005 henning

keep a list of member interfaces in ifg_group


# 1.67 22-May-2005 henning

allow pf to match on interface groups
pass on mygroup ...
markus ok


# 1.66 21-May-2005 henning

clean up and rework the interface abstraction code big time, rip out multiple
useless layers of indirection and make the code way cleaner overall.
this is just the start, more to come...
worked very hard on by Ryan and me in Montreal last week, on the airplane to
vancouver and yesterday here in calgary. it hurt.
ok ryan theo


# 1.65 20-Apr-2005 mpf

Introduce if_linkstatehooks.
This converts if_link_state_change() to a generic usable
callback with dohooks().

OK henning@, camield@
Tested by camield@ and Alexey E. Suslikov


Revision tags: OPENBSD_3_7_BASE
# 1.64 07-Feb-2005 mcbride

Add new function if_link_state_change() to take care of sending messages
on the routing socket and notifying carp() of link changes.

ok brad@ mpf@


# 1.63 14-Jan-2005 henning

remove old ifgroups ioctls
the old ifgroups haven't been in use ever really, and the new
implementation is 3 months old today. theo ok (3 months ago)


# 1.62 07-Dec-2004 mcbride

Convert carp(4) to behave more like a regular interface, much in the same
style as vlan(4). carp interfaces no longer require the physical interface
to be on the same subnet as the carp interface, or even that the physical
interface has an address at all, so CARP can now be used on /30 networks.

ok deraadt@ henning@


# 1.61 07-Dec-2004 mcbride

KNF


# 1.60 03-Dec-2004 henning

do not use one struct timeout for the if congestion stuff, but embed
a struct timeout to struct ifqueue so that each one has its own - it
is a per-queue thing. from chris pascoe


# 1.59 10-Nov-2004 grange

Safer IF_INPUT_ENQUEUE macro.

ok millert@


# 1.58 14-Oct-2004 mickey

avoid stupid commons


# 1.57 11-Oct-2004 henning

ifgroups rewrite
there is now a TAILQ with all interface groups as members, and
in struct ifnet there is only a pointer to the group structure stored
and not its name.
mostly hacked at c2k4 and somewhere over the atlantic ocean
ok markus mcbride


Revision tags: OPENBSD_3_6_BASE
# 1.56 26-Jun-2004 markus

cleanup ioctl for ifgroups; ok pb@


# 1.55 25-Jun-2004 pb

introduce "interface groups"

by "ifconfig fxp0 group foobar" "ifconfig xl0 group foobar"
these two interfaces are in one group.
Every interface has its if-family as default group.

idea/design from henning@, based on some work/discussion from Joris Vink.

henning@, mcbride@ ok.


# 1.54 20-Jun-2004 beck

undo mbuf cluster breakage that causes free'ed packets to show up on the
input queues when using dhcp and hostap wi, or xl, or fxp....
ok art@


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.53 29-May-2004 jcs

introduce SIOCSIFDESCR and SIOCGIFDESCR to maintain interface
descriptions, configurable with ifconfig

help from various, ok deraadt@


# 1.52 18-May-2004 brad

if_ether.h
add ETHER_MAX_LEN_JUMBO, ETHER_VLAN_ENCAP_LEN, ETHER_ALIGN, and
ETHERMTU_JUMBO constants.

if.h
add a few more interface capabilities flags.

Some from NetBSD, some from FreeBSD.

ok markus@


# 1.51 26-Apr-2004 mcbride

Before enqueueing the packet, copy the contents of incoming clusters
to the mbuf and free the cluster when it contains a small packet.

ok deraadt@


# 1.50 17-Apr-2004 henning

add a congestion indicator to if_queue. It is set when the input queue
is full, along with a timer that unsets it again after 10ms.
The input queue being full is a reliable indicator for CPU overload, and
this flag allows other subsystems to cope with the situation.
hacked with beck
ok kjc@ markus@ beck@
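
A rough sketch of the congestion indicator described above, under stated
assumptions: struct inq, inq_enqueue and inq_congested are hypothetical
names, and the flag is cleared by comparing against a caller-supplied
millisecond clock rather than by a kernel timer as in the actual commit.

```c
#include <stdint.h>

#define CONGESTION_HOLD_MS 10	/* flag is held for 10ms, per the log entry */

/* Hypothetical input queue, not the real struct ifqueue. */
struct inq {
	int len;		/* packets currently queued */
	int maxlen;		/* queue limit */
	int congested;		/* congestion indicator */
	uint64_t clear_at_ms;	/* time at which the indicator expires */
};

/* Enqueue one packet; on a full queue, set the indicator and drop. */
static int
inq_enqueue(struct inq *q, uint64_t now_ms)
{
	if (q->len >= q->maxlen) {
		q->congested = 1;
		q->clear_at_ms = now_ms + CONGESTION_HOLD_MS;
		return (-1);
	}
	q->len++;
	return (0);
}

/* Other subsystems poll this to decide whether to shed load. */
static int
inq_congested(struct inq *q, uint64_t now_ms)
{
	if (q->congested && now_ms >= q->clear_at_ms)
		q->congested = 0;	/* hold time elapsed, unset the flag */
	return (q->congested);
}
```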


Revision tags: OPENBSD_3_5_BASE
# 1.49 15-Jan-2004 markus

add a RTM_IFANNOUNCE message; from netbsd; ok itojun, henning


# 1.48 16-Dec-2003 markus

return error in ifc_destroy; ok deraadt, itojun, cedric, hshoexer


# 1.47 10-Dec-2003 itojun

use if_indexlim (instead of if_index) and ifindex2ifnet[x] != NULL
to check if interface exists, as (1) if_index will have different meaning
(2) ifindex2ifnet could become NULL when interface gets destroyed,
when we introduce dynamically-created interfaces. markus ok
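
The existence check described in this entry could look roughly like the
following. This is a sketch only: struct ifnet_stub, IF_INDEXLIM,
ifindex2ifnet_tbl and if_exists are hypothetical stand-ins, not the
kernel's actual declarations.

```c
#include <stddef.h>

struct ifnet_stub { int unused; };	/* hypothetical stand-in for struct ifnet */

#define IF_INDEXLIM 8			/* hypothetical size of the index table */
static struct ifnet_stub *ifindex2ifnet_tbl[IF_INDEXLIM];

/*
 * An index refers to a live interface only if it is inside the table
 * and the slot has not been cleared by an interface being destroyed.
 */
static int
if_exists(unsigned int idx)
{
	return (idx < IF_INDEXLIM && ifindex2ifnet_tbl[idx] != NULL);
}
```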


# 1.46 08-Dec-2003 markus

add SIOCIFGCLONERS; ifconfig -C; from netbsd; ok henning, deraadt


# 1.45 03-Dec-2003 markus

support for network interface "cloning", e.g. gif(4) via ifconfig(8)


# 1.44 19-Oct-2003 david

more typos


# 1.43 17-Oct-2003 mcbride

Common Address Redundancy Protocol

Allows multiple hosts to share an IP address, providing high availability
and load balancing.

Based on code by mickey@, with additional help from markus@
and Marco_Pfatschbacher@genua.de

ok deraadt@


Revision tags: OPENBSD_3_4_BASE
# 1.42 25-Aug-2003 fgsch

if_init support, required by ieee80211.
deraadt@ ok.


# 1.41 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


Revision tags: OPENBSD_3_2_BASE OPENBSD_3_3_BASE UBC_SYNC_A UBC_SYNC_B
# 1.40 03-Jul-2002 miod

Change all variables definitions (int foo) in sys/sys/*.h to variable
declarations (extern int foo), and compensate in the appropriate locations.


# 1.39 30-Jun-2002 itojun

allocate sockaddr_dl for ifnet in if_alloc_sadl(), as we don't always know
the size of sockaddr_dl on if_attach() - for instance, see ether_ifattach().
from netbsd. fgs ok


# 1.38 23-Jun-2002 itojun

g/c last remains of old ipv6 prefix management


# 1.37 27-May-2002 itojun

if_attach() gets called before domaininit(). scan all interfaces for if_afdata
initialization after domaininit().


# 1.36 27-May-2002 itojun

framework to add af-dependent data structure to struct ifnet.
as discussed at bsd-api-discuss. sync w/kame


# 1.35 24-Apr-2002 dhartmei

Add hooks to struct ifnet that allow registering callbacks that will be
notified of interface address changes. ok provos@, angelos@


Revision tags: OPENBSD_3_1_BASE
# 1.34 15-Mar-2002 millert

Cosmetic changes only, primarily making comments line up nicely after the
__P removal.


# 1.33 14-Mar-2002 millert

First round of __P removal in sys


# 1.32 23-Jan-2002 fgsch

compatability -> compatibility.


Revision tags: OPENBSD_3_0_BASE UBC_BASE
# 1.31 05-Jul-2001 angelos

branches: 1.31.4;
KNF


# 1.30 05-Jul-2001 jjbg

Include files for IPComp support. angelos@ ok.


# 1.29 27-Jun-2001 kjc

ALTQ base modifications to the kernel.
- ALTQ introduces a set of new queue macros that coexist with the
traditional IF_XXX macros.
- "struct ifaltq" replaces "struct ifqueue" in "struct ifnet".
- assign cdev major 74 for i386 and 54 for alpha as ALTQ control interface.


# 1.28 23-Jun-2001 fgsch

Add ether_input_mbuf to help us remove the ether_header from
ether_input; all drivers should start migrating to this.
Discussed with jason@, deraadt@ more or less ok'ed.


# 1.27 15-Jun-2001 itojun

change the meaning of ifnet.if_lastchange to meet RFC1573 ifLastChange.
follows BSD/OS practice and ucd-snmp code (FreeBSD does it for specific
interfaces only).

was: if_lastchange gets updated on every packet transmission/receipt.
now: if_lastchange gets updated when IFF_UP is changed.


# 1.26 09-Jun-2001 angelos

By popular demand, protect from multiple inclusion, and fix to use the
same naming style.


# 1.25 28-May-2001 angelos

IPSECv4 -> IPSEC


# 1.24 28-May-2001 angelos

No need for separate ESP/AH interface capabilities.


# 1.23 28-May-2001 angelos

Interface capabilities (based on NetBSD, but merge ethercom and ifnet
capabilities into one, in the ifp).


Revision tags: OPENBSD_2_9_BASE
# 1.22 06-Feb-2001 mickey

allow changing number of loopbacks in ukc.
change rest of the code to use lo0ifp pointing
to the corresponding struct ifnet.
itojun@ and niklas@ ok


# 1.21 19-Jan-2001 itojun

pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2 (UCB copyrighted).

have sys/net/route.c:rtrequest1(), which takes rt_addrinfo * as the argument.
pass rt_addrinfo all the way down to rtrequest, and ifa->ifa_rtrequest.
3rd arg of ifa->ifa_rtrequest is now rt_addrinfo * instead of sockaddr *
(almost no one is using it anyways).

benefit: the following command now works. previously we need two route(8)
invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

remove unsafe typecast in rtrequest(), from rtentry * to sockaddr *. it was
introduced by 4.3BSD-reno and never corrected.

XXX is eon_rtrequest() change correct regarding to 3rd arg?
eon_rtrequest() and rtrequest() were incorrect since 4.3BSD-reno,
so i do not have correct answer in the source code.
someone with more clue about netiso-over-ip, please help.


Revision tags: OPENBSD_2_8_BASE
# 1.20 20-Sep-2000 art

Since ifa_refcnt was bumped to an int and rt_flags is an int too, bump
ifa_flags to int.


# 1.19 28-Aug-2000 deraadt

changing the size of if_data has heavy impact on userland compat. there
was even a u_char slot available for ifi_link_state, which clearly does not
need a full 32 bits.


# 1.18 26-Aug-2000 nate

sync mii code with netbsd
adds detach functionality for phys
some code cleanup

Nobody really had time to test all of this out, but theo said commit anyway


Revision tags: OPENBSD_2_7_BASE
# 1.17 22-Mar-2000 itojun

remove if_withname(), which was imported during KAME merge by mistake.


# 1.16 21-Mar-2000 mickey

add SIOCGIFMTU/SIOCSIFMTU; remediate redundant code of tun, ppp, sppp; chris@ ok


Revision tags: SMP_BASE
# 1.15 02-Feb-2000 itojun

branches: 1.15.2;
wrap IFAFREE() by "do {} while (0)". it wasn't safe enough.


Revision tags: kame_19991208
# 1.14 08-Dec-1999 itojun

bring in KAME IPv6 code, dated 19991208.
replaces NRL IPv6 layer. reuses NRL pcb layer. no IPsec-on-v6 support.
see sys/netinet6/{TODO,IMPLEMENTATION} for more details.

GENERIC configuration should work fine as before. GENERIC.v6 works fine
as well, but you'll need KAME userland tools to play with IPv6 (will be
brought in soon).


Revision tags: OPENBSD_2_6_BASE
# 1.13 08-Aug-1999 niklas

Support detaching of network interfaces. Still work to do in ipf, and
other families than inet.


# 1.12 23-Jun-1999 cmetz

Added some protocol independent interfaces (supposedly IPv6 support APIs, but
ones that are useful for all protocols, not just IPv6).


Revision tags: OPENBSD_2_5_BASE
# 1.11 13-Mar-1999 deraadt

make ifa_refcnt a u_int; andrewb@demon.net


# 1.10 26-Feb-1999 jason

Ethernet bridge/IP firewall driver.


# 1.9 07-Jan-1999 deraadt

fix IFAFREE() to be safe for if/else nesting


Revision tags: OPENBSD_2_4_BASE
# 1.8 03-Sep-1998 jason

o OpenBSD gets if_media support (from NetBSD)
o rework/simplify if_xl to use it


Revision tags: OPENBSD_2_0_BASE OPENBSD_2_1_BASE OPENBSD_2_2_BASE OPENBSD_2_3_BASE
# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 29-Jun-1996 deraadt

provide if_attachhead(), and make if_loop use it


# 1.5 10-May-1996 deraadt

if_name/if_unit -> if_xname/if_softc


# 1.4 06-May-1996 mickey

if.h was missed from the commit.
if_ethersubr.c: missed variables added.


# 1.3 05-Mar-1996 mickey

Changes for ifconfig to compile.


# 1.2 03-Mar-1996 niklas

From NetBSD: 960217 merge


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision