History log of /netbsd-current/sys/net/if_bridge.c
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 1.189 29-Jul-2022 skrll

Sprinkle const


# 1.188 29-Jul-2022 skrll

Trailing whitespace


# 1.187 20-Jun-2022 yamaguchi

bridge(4): support VLAN frames stripped by hardware tagging


# 1.186 31-Dec-2021 riastradh

sys: Use if_init wrapper function.

Exception: Not in kern_pmf.c, for the kind of silly reason that it
avoids having kern_pmf.c refer to symbols defined only in net; this
avoids a pain in the rump.


# 1.185 31-Dec-2021 riastradh

sys: Use if_ioctl wrapper function.


# 1.184 31-Dec-2021 riastradh

sys: Use if_stop wrapper function.

Exception: Not in kern_pmf.c, for the kind of silly reason that it
avoids having kern_pmf.c refer to symbols defined only in net; this
avoids a pain in the rump.


# 1.183 30-Sep-2021 yamaguchi

bridge: Register bridge_ifdetach to ether_ifdetach hook


# 1.182 30-Sep-2021 yamaguchi

bridge: Register bridge_calc_link_state to link-state change hook


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.181 02-Jul-2021 yamaguchi

Use if_ioctl() for changing MTU, not ether_ioctl to prevent panic

Fix PR kern/56292


# 1.180 16-Jun-2021 riastradh

if_attach and if_initialize cannot fail, don't test return value

These were originally made failable back in 2017 when if_initialize
allocated a softint in every interface for link state changes, so
that it could fail gracefully instead of panicking:

https://mail-index.NetBSD.org/source-changes/2017/10/23/msg089053.html

However, this spawned many seldom- or never-tested error branches,
which are risky to have around. And that softint in every interface
has since been replaced by a single global workqueue, because link
state changes require thread context but not low latency or high
throughput:

https://mail-index.NetBSD.org/source-changes/2020/02/06/msg113759.html

So there is no longer any reason for if_initialize to fail. (The
subroutine if_stats_init can't fail because percpu_alloc can't fail
either.)

There is a snag: the softint_establish in if_percpuq_create could
fail, potentially leading to bad consequences later on trying to use
the softint. This change doesn't introduce any new bugs because of
the snag -- if_percpuq_attach was already broken. However, the snag
can be better addressed without spawning error branches, either by
using a single softint or making softints less scarce.

(Separate commit will change the signatures of if_attach and
if_initialize to return void, scheduled to ride whatever is the next
convenient kernel bump.)

Patch and testing on amd64 and evbmips64-eb by maya@; commit message
soliloquy, and compile-testing on evbppc/i386/earmv7hf, by me.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.179 19-Feb-2021 christos

branches: 1.179.4;
- Make ALIGNED_POINTER use __alignof(t) instead of sizeof(t). This is more
correct because it works with non-primitive types and provides the ABI
alignment for the type the compiler will use.
- Remove all the *_HDR_ALIGNMENT macros and asserts
- Replace POINTER_ALIGNED_P with ACCESSIBLE_POINTER which is identical to
ALIGNED_POINTER, but returns that the pointer is always aligned if the
CPU supports unaligned accesses.
[ as proposed in tech-kern ]


# 1.178 14-Feb-2021 christos

- centralize header align and pullup into a single inline function
- use a single macro to align pointers and expose the alignment, instead
of hard-coding 3 in 1/2 the macros.
- fix an issue in the ipv6 lt2p where it was aligning for ipv4 and pulling
for ipv6.


# 1.177 02-Nov-2020 roy

bridge: revert prior

It's of little use.
If we need to do this in the future, consider a sysctl to do it for all
interfaces in the bridge and not just the one being added.


# 1.176 27-Sep-2020 roy

branches: 1.176.2;
bridge: When an interface joins then mark addresses on it as tentative

The exact flow is detatch addresses, join bridge and then mark detached
addresses as tentative.
This ensures that Duplicate Address Detection for the joining interface
are performed across all members of the bridge.


# 1.175 27-Sep-2020 roy

bridge: Calculate link state as the best link state of any member

If any member is LINK_STATE_UP then it's LINK_STATE_UP.
Otherwise if any member is LINK_STATE_UNKNOWN then it's LINK_STATE_UNKNOWN.
Otherwise it's LINK_STATE_DOWN.


# 1.174 01-Aug-2020 maxv

Remove #ifdef BRIDGE_IPF, compile in the code by default. Sent to
tech-net@.


# 1.173 01-May-2020 jdolecek

report no enabled capabilities when no interface is part of bridge


# 1.172 30-Apr-2020 jdolecek

for bridge(4), report the common enabled capabilities of the members
via SIOCGIFCAP for visibility


# 1.171 27-Apr-2020 jdolecek

if MTU of the added interface doesn't match the bridge, modify the MTU
of the interface to that of the bridge instead of just refusing the
addition with EINVAL

this is a convenience feature to simplify bridge setup with non-standard
MTU, the useful behaviour observed with Linux xenbr


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406
# 1.170 27-Mar-2020 jdolecek

replace the conditional m_pullup() on start of bridge_output() with
a KASSERT(), to make it clear no mbuf manipulation is ever done here

the condition should never trigger, this always runs after ether_output()
M_PREPEND()s ether_header


# 1.169 24-Mar-2020 jdolecek

reset the csum_flags in bridge_brodcast() also for bmcast path

for destination interfaces with real hardware offloading this fixes
multicast packet corruption; for xvif(4) this fix stops treating them
as having no csum

may fix PR kern/42386


Revision tags: ad-namecache-base3
# 1.168 24-Feb-2020 rin

Remove debug printf I put into bridge_calc_csum_flags().
Sorry for noise.


# 1.167 23-Feb-2020 jdolecek

disable the DEBUG bridge_calc_csum_flags() printf


# 1.166 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.165 05-Aug-2019 msaitoh

branches: 1.165.2;
Cast uint32_t to avoid undefined behavior in bridge_rthash(). Found by kUBSan.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

branches: 1.164.4;
Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

branches: 1.156.2;
Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.187 20-Jun-2022 yamaguchi

bridge(4): support VLAN frames stripped by hardware tagging


# 1.186 31-Dec-2021 riastradh

sys: Use if_init wrapper function.

Exception: Not in kern_pmf.c, for the kind of silly reason that it
avoids having kern_pmf.c refer to symbols defined only in net; this
avoids a pain in the rump.


# 1.185 31-Dec-2021 riastradh

sys: Use if_ioctl wrapper function.


# 1.184 31-Dec-2021 riastradh

sys: Use if_stop wrapper function.

Exception: Not in kern_pmf.c, for the kind of silly reason that it
avoids having kern_pmf.c refer to symbols defined only in net; this
avoids a pain in the rump.


# 1.183 30-Sep-2021 yamaguchi

bridge: Register bridge_ifdetach to ether_ifdetach hook


# 1.182 30-Sep-2021 yamaguchi

bridge: Register bridge_calc_link_state to link-state change hook


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.181 02-Jul-2021 yamaguchi

Use if_ioctl() for changing MTU, not ether_ioctl to prevent panic

Fix PR kern/56292


# 1.180 16-Jun-2021 riastradh

if_attach and if_initialize cannot fail, don't test return value

These were originally made failable back in 2017 when if_initialize
allocated a softint in every interface for link state changes, so
that it could fail gracefully instead of panicking:

https://mail-index.NetBSD.org/source-changes/2017/10/23/msg089053.html

However, this spawned many seldom- or never-tested error branches,
which are risky to have around. And that softint in every interface
has since been replaced by a single global workqueue, because link
state changes require thread context but not low latency or high
throughput:

https://mail-index.NetBSD.org/source-changes/2020/02/06/msg113759.html

So there is no longer any reason for if_initialize to fail. (The
subroutine if_stats_init can't fail because percpu_alloc can't fail
either.)

There is a snag: the softint_establish in if_percpuq_create could
fail, potentially leading to bad consequences later on trying to use
the softint. This change doesn't introduce any new bugs because of
the snag -- if_percpuq_attach was already broken. However, the snag
can be better addressed without spawning error branches, either by
using a single softint or making softints less scarce.

(Separate commit will change the signatures of if_attach and
if_initialize to return void, scheduled to ride whatever is the next
convenient kernel bump.)

Patch and testing on amd64 and evbmips64-eb by maya@; commit message
soliloquy, and compile-testing on evbppc/i386/earmv7hf, by me.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.179 19-Feb-2021 christos

branches: 1.179.4;
- Make ALIGNED_POINTER use __alignof(t) instead of sizeof(t). This is more
correct because it works with non-primitive types and provides the ABI
alignment for the type the compiler will use.
- Remove all the *_HDR_ALIGNMENT macros and asserts
- Replace POINTER_ALIGNED_P with ACCESSIBLE_POINTER which is identical to
ALIGNED_POINTER, but returns that the pointer is always aligned if the
CPU supports unaligned accesses.
[ as proposed in tech-kern ]


# 1.178 14-Feb-2021 christos

- centralize header align and pullup into a single inline function
- use a single macro to align pointers and expose the alignment, instead
of hard-coding 3 in 1/2 the macros.
- fix an issue in the ipv6 lt2p where it was aligning for ipv4 and pulling
for ipv6.


# 1.177 02-Nov-2020 roy

bridge: revert prior

It's of little use.
If we need to do this in the future, consider a sysctl to do it for all
interfaces in the bridge and not just the one being added.


# 1.176 27-Sep-2020 roy

branches: 1.176.2;
bridge: When an interface joins then mark addresses on it as tentative

The exact flow is detatch addresses, join bridge and then mark detached
addresses as tentative.
This ensures that Duplicate Address Detection for the joining interface
are performed across all members of the bridge.


# 1.175 27-Sep-2020 roy

bridge: Calculate link state as the best link state of any member

If any member is LINK_STATE_UP then it's LINK_STATE_UP.
Otherwise if any member is LINK_STATE_UNKNOWN then it's LINK_STATE_UNKNOWN.
Otherwise it's LINK_STATE_DOWN.


# 1.174 01-Aug-2020 maxv

Remove #ifdef BRIDGE_IPF, compile in the code by default. Sent to
tech-net@.


# 1.173 01-May-2020 jdolecek

report no enabled capabilities when no interface is part of bridge


# 1.172 30-Apr-2020 jdolecek

for bridge(4), report the common enabled capabilities of the members
via SIOCGIFCAP for visibility


# 1.171 27-Apr-2020 jdolecek

if MTU of the added interface doesn't match the bridge, modify the MTU
of the interface to that of the bridge instead of just refusing the
addition with EINVAL

this is a convenience feature to simplify bridge setup with non-standard
MTU, the useful behaviour observed with Linux xenbr


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406
# 1.170 27-Mar-2020 jdolecek

replace the conditional m_pullup() on start of bridge_output() with
a KASSERT(), to make it clear no mbuf manipulation is ever done here

the condition should never trigger, this always runs after ether_output()
M_PREPEND()s ether_header


# 1.169 24-Mar-2020 jdolecek

reset the csum_flags in bridge_brodcast() also for bmcast path

for destination interfaces with real hardware offloading this fixes
multicast packet corruption; for xvif(4) this fix stops treating them
as having no csum

may fix PR kern/42386


Revision tags: ad-namecache-base3
# 1.168 24-Feb-2020 rin

Remove debug printf I put into bridge_calc_csum_flags().
Sorry for noise.


# 1.167 23-Feb-2020 jdolecek

disable the DEBUG bridge_calc_csum_flags() printf


# 1.166 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.165 05-Aug-2019 msaitoh

branches: 1.165.2;
Cast uint32_t to avoid undefined behavior in bridge_rthash(). Found by kUBSan.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

branches: 1.164.4;
Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

branches: 1.156.2;
Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.186 31-Dec-2021 riastradh

sys: Use if_init wrapper function.

Exception: Not in kern_pmf.c, for the kind of silly reason that it
avoids having kern_pmf.c refer to symbols defined only in net; this
avoids a pain in the rump.


# 1.185 31-Dec-2021 riastradh

sys: Use if_ioctl wrapper function.


# 1.184 31-Dec-2021 riastradh

sys: Use if_stop wrapper function.

Exception: Not in kern_pmf.c, for the kind of silly reason that it
avoids having kern_pmf.c refer to symbols defined only in net; this
avoids a pain in the rump.


# 1.183 30-Sep-2021 yamaguchi

bridge: Register bridge_ifdetach to ether_ifdetach hook


# 1.182 30-Sep-2021 yamaguchi

bridge: Register bridge_calc_link_state to link-state change hook


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.181 02-Jul-2021 yamaguchi

Use if_ioctl() for changing MTU, not ether_ioctl to prevent panic

Fix PR kern/56292


# 1.180 16-Jun-2021 riastradh

if_attach and if_initialize cannot fail, don't test return value

These were originally made failable back in 2017 when if_initialize
allocated a softint in every interface for link state changes, so
that it could fail gracefully instead of panicking:

https://mail-index.NetBSD.org/source-changes/2017/10/23/msg089053.html

However, this spawned many seldom- or never-tested error branches,
which are risky to have around. And that softint in every interface
has since been replaced by a single global workqueue, because link
state changes require thread context but not low latency or high
throughput:

https://mail-index.NetBSD.org/source-changes/2020/02/06/msg113759.html

So there is no longer any reason for if_initialize to fail. (The
subroutine if_stats_init can't fail because percpu_alloc can't fail
either.)

There is a snag: the softint_establish in if_percpuq_create could
fail, potentially leading to bad consequences later on trying to use
the softint. This change doesn't introduce any new bugs because of
the snag -- if_percpuq_attach was already broken. However, the snag
can be better addressed without spawning error branches, either by
using a single softint or making softints less scarce.

(Separate commit will change the signatures of if_attach and
if_initialize to return void, scheduled to ride whatever is the next
convenient kernel bump.)

Patch and testing on amd64 and evbmips64-eb by maya@; commit message
soliloquy, and compile-testing on evbppc/i386/earmv7hf, by me.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.179 19-Feb-2021 christos

branches: 1.179.4;
- Make ALIGNED_POINTER use __alignof(t) instead of sizeof(t). This is more
correct because it works with non-primitive types and provides the ABI
alignment for the type the compiler will use.
- Remove all the *_HDR_ALIGNMENT macros and asserts
- Replace POINTER_ALIGNED_P with ACCESSIBLE_POINTER which is identical to
ALIGNED_POINTER, but returns that the pointer is always aligned if the
CPU supports unaligned accesses.
[ as proposed in tech-kern ]


# 1.178 14-Feb-2021 christos

- centralize header align and pullup into a single inline function
- use a single macro to align pointers and expose the alignment, instead
of hard-coding 3 in 1/2 the macros.
- fix an issue in the ipv6 lt2p where it was aligning for ipv4 and pulling
for ipv6.


# 1.177 02-Nov-2020 roy

bridge: revert prior

It's of little use.
If we need to do this in the future, consider a sysctl to do it for all
interfaces in the bridge and not just the one being added.


# 1.176 27-Sep-2020 roy

branches: 1.176.2;
bridge: When an interface joins then mark addresses on it as tentative

The exact flow is detatch addresses, join bridge and then mark detached
addresses as tentative.
This ensures that Duplicate Address Detection for the joining interface
are performed across all members of the bridge.


# 1.175 27-Sep-2020 roy

bridge: Calculate link state as the best link state of any member

If any member is LINK_STATE_UP then it's LINK_STATE_UP.
Otherwise if any member is LINK_STATE_UNKNOWN then it's LINK_STATE_UNKNOWN.
Otherwise it's LINK_STATE_DOWN.


# 1.174 01-Aug-2020 maxv

Remove #ifdef BRIDGE_IPF, compile in the code by default. Sent to
tech-net@.


# 1.173 01-May-2020 jdolecek

report no enabled capabilities when no interface is part of bridge


# 1.172 30-Apr-2020 jdolecek

for bridge(4), report the common enabled capabilities of the members
via SIOCGIFCAP for visibility


# 1.171 27-Apr-2020 jdolecek

if MTU of the added interface doesn't match the bridge, modify the MTU
of the interface to that of the bridge instead of just refusing the
addition with EINVAL

this is a convenience feature to simplify bridge setup with non-standard
MTU, the useful behaviour observed with Linux xenbr


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406
# 1.170 27-Mar-2020 jdolecek

replace the conditional m_pullup() on start of bridge_output() with
a KASSERT(), to make it clear no mbuf manipulation is ever done here

the condition should never trigger, this always runs after ether_output()
M_PREPEND()s ether_header


# 1.169 24-Mar-2020 jdolecek

reset the csum_flags in bridge_brodcast() also for bmcast path

for destination interfaces with real hardware offloading this fixes
multicast packet corruption; for xvif(4) this fix stops treating them
as having no csum

may fix PR kern/42386


Revision tags: ad-namecache-base3
# 1.168 24-Feb-2020 rin

Remove debug printf I put into bridge_calc_csum_flags().
Sorry for noise.


# 1.167 23-Feb-2020 jdolecek

disable the DEBUG bridge_calc_csum_flags() printf


# 1.166 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.165 05-Aug-2019 msaitoh

branches: 1.165.2;
Cast uint32_t to avoid undefined behavior in bridge_rthash(). Found by kUBSan.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

branches: 1.164.4;
Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

branches: 1.156.2;
Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.183 30-Sep-2021 yamaguchi

bridge: Register bridge_ifdetach to ether_ifdetach hook


# 1.182 30-Sep-2021 yamaguchi

bridge: Register bridge_calc_link_state to link-state change hook


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.181 02-Jul-2021 yamaguchi

Use if_ioctl() for changing MTU, not ether_ioctl to prevent panic

Fix PR kern/56292


# 1.180 16-Jun-2021 riastradh

if_attach and if_initialize cannot fail, don't test return value

These were originally made failable back in 2017 when if_initialize
allocated a softint in every interface for link state changes, so
that it could fail gracefully instead of panicking:

https://mail-index.NetBSD.org/source-changes/2017/10/23/msg089053.html

However, this spawned many seldom- or never-tested error branches,
which are risky to have around. And that softint in every interface
has since been replaced by a single global workqueue, because link
state changes require thread context but not low latency or high
throughput:

https://mail-index.NetBSD.org/source-changes/2020/02/06/msg113759.html

So there is no longer any reason for if_initialize to fail. (The
subroutine if_stats_init can't fail because percpu_alloc can't fail
either.)

There is a snag: the softint_establish in if_percpuq_create could
fail, potentially leading to bad consequences later on trying to use
the softint. This change doesn't introduce any new bugs because of
the snag -- if_percpuq_attach was already broken. However, the snag
can be better addressed without spawning error branches, either by
using a single softint or making softints less scarce.

(Separate commit will change the signatures of if_attach and
if_initialize to return void, scheduled to ride whatever is the next
convenient kernel bump.)

Patch and testing on amd64 and evbmips64-eb by maya@; commit message
soliloquy, and compile-testing on evbppc/i386/earmv7hf, by me.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.179 19-Feb-2021 christos

branches: 1.179.4;
- Make ALIGNED_POINTER use __alignof(t) instead of sizeof(t). This is more
correct because it works with non-primitive types and provides the ABI
alignment for the type the compiler will use.
- Remove all the *_HDR_ALIGNMENT macros and asserts
- Replace POINTER_ALIGNED_P with ACCESSIBLE_POINTER which is identical to
ALIGNED_POINTER, but returns that the pointer is always aligned if the
CPU supports unaligned accesses.
[ as proposed in tech-kern ]


# 1.178 14-Feb-2021 christos

- centralize header align and pullup into a single inline function
- use a single macro to align pointers and expose the alignment, instead
of hard-coding 3 in 1/2 the macros.
- fix an issue in the ipv6 lt2p where it was aligning for ipv4 and pulling
for ipv6.


# 1.177 02-Nov-2020 roy

bridge: revert prior

It's of little use.
If we need to do this in the future, consider a sysctl to do it for all
interfaces in the bridge and not just the one being added.


# 1.176 27-Sep-2020 roy

branches: 1.176.2;
bridge: When an interface joins then mark addresses on it as tentative

The exact flow is detatch addresses, join bridge and then mark detached
addresses as tentative.
This ensures that Duplicate Address Detection for the joining interface
are performed across all members of the bridge.


# 1.175 27-Sep-2020 roy

bridge: Calculate link state as the best link state of any member

If any member is LINK_STATE_UP then it's LINK_STATE_UP.
Otherwise if any member is LINK_STATE_UNKNOWN then it's LINK_STATE_UNKNOWN.
Otherwise it's LINK_STATE_DOWN.


# 1.174 01-Aug-2020 maxv

Remove #ifdef BRIDGE_IPF, compile in the code by default. Sent to
tech-net@.


# 1.173 01-May-2020 jdolecek

report no enabled capabilities when no interface is part of bridge


# 1.172 30-Apr-2020 jdolecek

for bridge(4), report the common enabled capabilities of the members
via SIOCGIFCAP for visibility


# 1.171 27-Apr-2020 jdolecek

if MTU of the added interface doesn't match the bridge, modify the MTU
of the interface to that of the bridge instead of just refusing the
addition with EINVAL

this is a convenience feature to simplify bridge setup with non-standard
MTU, the useful behaviour observed with Linux xenbr


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406
# 1.170 27-Mar-2020 jdolecek

replace the conditional m_pullup() on start of bridge_output() with
a KASSERT(), to make it clear no mbuf manipulation is ever done here

the condition should never trigger, this always runs after ether_output()
M_PREPEND()s ether_header


# 1.169 24-Mar-2020 jdolecek

reset the csum_flags in bridge_brodcast() also for bmcast path

for destination interfaces with real hardware offloading this fixes
multicast packet corruption; for xvif(4) this fix stops treating them
as having no csum

may fix PR kern/42386


Revision tags: ad-namecache-base3
# 1.168 24-Feb-2020 rin

Remove debug printf I put into bridge_calc_csum_flags().
Sorry for noise.


# 1.167 23-Feb-2020 jdolecek

disable the DEBUG bridge_calc_csum_flags() printf


# 1.166 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.165 05-Aug-2019 msaitoh

branches: 1.165.2;
Cast uint32_t to avoid undefined behavior in bridge_rthash(). Found by kUBSan.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

branches: 1.164.4;
Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

branches: 1.156.2;
Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.181 02-Jul-2021 yamaguchi

Use if_ioctl() for changing MTU, not ether_ioctl to prevent panic

Fix PR kern/56292


Revision tags: thorpej-i2c-spi-conf-base
# 1.180 16-Jun-2021 riastradh

if_attach and if_initialize cannot fail, don't test return value

These were originally made failable back in 2017 when if_initialize
allocated a softint in every interface for link state changes, so
that it could fail gracefully instead of panicking:

https://mail-index.NetBSD.org/source-changes/2017/10/23/msg089053.html

However, this spawned many seldom- or never-tested error branches,
which are risky to have around. And that softint in every interface
has since been replaced by a single global workqueue, because link
state changes require thread context but not low latency or high
throughput:

https://mail-index.NetBSD.org/source-changes/2020/02/06/msg113759.html

So there is no longer any reason for if_initialize to fail. (The
subroutine if_stats_init can't fail because percpu_alloc can't fail
either.)

There is a snag: the softint_establish in if_percpuq_create could
fail, potentially leading to bad consequences later on trying to use
the softint. This change doesn't introduce any new bugs because of
the snag -- if_percpuq_attach was already broken. However, the snag
can be better addressed without spawning error branches, either by
using a single softint or making softints less scarce.

(Separate commit will change the signatures of if_attach and
if_initialize to return void, scheduled to ride whatever is the next
convenient kernel bump.)

Patch and testing on amd64 and evbmips64-eb by maya@; commit message
soliloquy, and compile-testing on evbppc/i386/earmv7hf, by me.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.179 19-Feb-2021 christos

branches: 1.179.4;
- Make ALIGNED_POINTER use __alignof(t) instead of sizeof(t). This is more
correct because it works with non-primitive types and provides the ABI
alignment for the type the compiler will use.
- Remove all the *_HDR_ALIGNMENT macros and asserts
- Replace POINTER_ALIGNED_P with ACCESSIBLE_POINTER which is identical to
ALIGNED_POINTER, but returns that the pointer is always aligned if the
CPU supports unaligned accesses.
[ as proposed in tech-kern ]


# 1.178 14-Feb-2021 christos

- centralize header align and pullup into a single inline function
- use a single macro to align pointers and expose the alignment, instead
of hard-coding 3 in 1/2 the macros.
- fix an issue in the ipv6 lt2p where it was aligning for ipv4 and pulling
for ipv6.


# 1.177 02-Nov-2020 roy

bridge: revert prior

It's of little use.
If we need to do this in the future, consider a sysctl to do it for all
interfaces in the bridge and not just the one being added.


# 1.176 27-Sep-2020 roy

branches: 1.176.2;
bridge: When an interface joins then mark addresses on it as tentative

The exact flow is detatch addresses, join bridge and then mark detached
addresses as tentative.
This ensures that Duplicate Address Detection for the joining interface
are performed across all members of the bridge.


# 1.175 27-Sep-2020 roy

bridge: Calculate link state as the best link state of any member

If any member is LINK_STATE_UP then it's LINK_STATE_UP.
Otherwise if any member is LINK_STATE_UNKNOWN then it's LINK_STATE_UNKNOWN.
Otherwise it's LINK_STATE_DOWN.


# 1.174 01-Aug-2020 maxv

Remove #ifdef BRIDGE_IPF, compile in the code by default. Sent to
tech-net@.


# 1.173 01-May-2020 jdolecek

report no enabled capabilities when no interface is part of bridge


# 1.172 30-Apr-2020 jdolecek

for bridge(4), report the common enabled capabilities of the members
via SIOCGIFCAP for visibility


# 1.171 27-Apr-2020 jdolecek

if MTU of the added interface doesn't match the bridge, modify the MTU
of the interface to that of the bridge instead of just refusing the
addition with EINVAL

this is a convenience feature to simplify bridge setup with non-standard
MTU, the useful behaviour observed with Linux xenbr


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406
# 1.170 27-Mar-2020 jdolecek

replace the conditional m_pullup() on start of bridge_output() with
a KASSERT(), to make it clear no mbuf manipulation is ever done here

the condition should never trigger, this always runs after ether_output()
M_PREPEND()s ether_header


# 1.169 24-Mar-2020 jdolecek

reset the csum_flags in bridge_brodcast() also for bmcast path

for destination interfaces with real hardware offloading this fixes
multicast packet corruption; for xvif(4) this fix stops treating them
as having no csum

may fix PR kern/42386


Revision tags: ad-namecache-base3
# 1.168 24-Feb-2020 rin

Remove debug printf I put into bridge_calc_csum_flags().
Sorry for noise.


# 1.167 23-Feb-2020 jdolecek

disable the DEBUG bridge_calc_csum_flags() printf


# 1.166 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.165 05-Aug-2019 msaitoh

branches: 1.165.2;
Cast uint32_t to avoid undefined behavior in bridge_rthash(). Found by kUBSan.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

branches: 1.164.4;
Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

branches: 1.156.2;
Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.180 16-Jun-2021 riastradh

if_attach and if_initialize cannot fail, don't test return value

These were originally made failable back in 2017 when if_initialize
allocated a softint in every interface for link state changes, so
that it could fail gracefully instead of panicking:

https://mail-index.NetBSD.org/source-changes/2017/10/23/msg089053.html

However, this spawned many seldom- or never-tested error branches,
which are risky to have around. And that softint in every interface
has since been replaced by a single global workqueue, because link
state changes require thread context but not low latency or high
throughput:

https://mail-index.NetBSD.org/source-changes/2020/02/06/msg113759.html

So there is no longer any reason for if_initialize to fail. (The
subroutine if_stats_init can't fail because percpu_alloc can't fail
either.)

There is a snag: the softint_establish in if_percpuq_create could
fail, potentially leading to bad consequences later on trying to use
the softint. This change doesn't introduce any new bugs because of
the snag -- if_percpuq_attach was already broken. However, the snag
can be better addressed without spawning error branches, either by
using a single softint or making softints less scarce.

(Separate commit will change the signatures of if_attach and
if_initialize to return void, scheduled to ride whatever is the next
convenient kernel bump.)

Patch and testing on amd64 and evbmips64-eb by maya@; commit message
soliloquy, and compile-testing on evbppc/i386/earmv7hf, by me.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.179 19-Feb-2021 christos

- Make ALIGNED_POINTER use __alignof(t) instead of sizeof(t). This is more
correct because it works with non-primitive types and provides the ABI
alignment for the type the compiler will use.
- Remove all the *_HDR_ALIGNMENT macros and asserts
- Replace POINTER_ALIGNED_P with ACCESSIBLE_POINTER which is identical to
ALIGNED_POINTER, but returns that the pointer is always aligned if the
CPU supports unaligned accesses.
[ as proposed in tech-kern ]


# 1.178 14-Feb-2021 christos

- centralize header align and pullup into a single inline function
- use a single macro to align pointers and expose the alignment, instead
of hard-coding 3 in 1/2 the macros.
- fix an issue in the ipv6 lt2p where it was aligning for ipv4 and pulling
for ipv6.


# 1.177 02-Nov-2020 roy

bridge: revert prior

It's of little use.
If we need to do this in the future, consider a sysctl to do it for all
interfaces in the bridge and not just the one being added.


# 1.176 27-Sep-2020 roy

branches: 1.176.2;
bridge: When an interface joins then mark addresses on it as tentative

The exact flow is detatch addresses, join bridge and then mark detached
addresses as tentative.
This ensures that Duplicate Address Detection for the joining interface
are performed across all members of the bridge.


# 1.175 27-Sep-2020 roy

bridge: Calculate link state as the best link state of any member

If any member is LINK_STATE_UP then it's LINK_STATE_UP.
Otherwise if any member is LINK_STATE_UNKNOWN then it's LINK_STATE_UNKNOWN.
Otherwise it's LINK_STATE_DOWN.


# 1.174 01-Aug-2020 maxv

Remove #ifdef BRIDGE_IPF, compile in the code by default. Sent to
tech-net@.


# 1.173 01-May-2020 jdolecek

report no enabled capabilities when no interface is part of bridge


# 1.172 30-Apr-2020 jdolecek

for bridge(4), report the common enabled capabilities of the members
via SIOCGIFCAP for visibility


# 1.171 27-Apr-2020 jdolecek

if MTU of the added interface doesn't match the bridge, modify the MTU
of the interface to that of the bridge instead of just refusing the
addition with EINVAL

this is a convenience feature to simplify bridge setup with non-standard
MTU, the useful behaviour observed with Linux xenbr


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406
# 1.170 27-Mar-2020 jdolecek

replace the conditional m_pullup() on start of bridge_output() with
a KASSERT(), to make it clear no mbuf manipulation is ever done here

the condition should never trigger, this always runs after ether_output()
M_PREPEND()s ether_header


# 1.169 24-Mar-2020 jdolecek

reset the csum_flags in bridge_brodcast() also for bmcast path

for destination interfaces with real hardware offloading this fixes
multicast packet corruption; for xvif(4) this fix stops treating them
as having no csum

may fix PR kern/42386


Revision tags: ad-namecache-base3
# 1.168 24-Feb-2020 rin

Remove debug printf I put into bridge_calc_csum_flags().
Sorry for noise.


# 1.167 23-Feb-2020 jdolecek

disable the DEBUG bridge_calc_csum_flags() printf


# 1.166 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.165 05-Aug-2019 msaitoh

branches: 1.165.2;
Cast uint32_t to avoid undefined behavior in bridge_rthash(). Found by kUBSan.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

branches: 1.164.4;
Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

branches: 1.156.2;
Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.179 19-Feb-2021 christos

- Make ALIGNED_POINTER use __alignof(t) instead of sizeof(t). This is more
correct because it works with non-primitive types and provides the ABI
alignment for the type the compiler will use.
- Remove all the *_HDR_ALIGNMENT macros and asserts
- Replace POINTER_ALIGNED_P with ACCESSIBLE_POINTER which is identical to
ALIGNED_POINTER, but returns that the pointer is always aligned if the
CPU supports unaligned accesses.
[ as proposed in tech-kern ]


# 1.178 14-Feb-2021 christos

- centralize header align and pullup into a single inline function
- use a single macro to align pointers and expose the alignment, instead
of hard-coding 3 in 1/2 the macros.
- fix an issue in the ipv6 lt2p where it was aligning for ipv4 and pulling
for ipv6.


Revision tags: thorpej-futex-base
# 1.177 02-Nov-2020 roy

bridge: revert prior

It's of little use.
If we need to do this in the future, consider a sysctl to do it for all
interfaces in the bridge and not just the one being added.


# 1.176 27-Sep-2020 roy

branches: 1.176.2;
bridge: When an interface joins then mark addresses on it as tentative

The exact flow is detatch addresses, join bridge and then mark detached
addresses as tentative.
This ensures that Duplicate Address Detection for the joining interface
are performed across all members of the bridge.


# 1.175 27-Sep-2020 roy

bridge: Calculate link state as the best link state of any member

If any member is LINK_STATE_UP then it's LINK_STATE_UP.
Otherwise if any member is LINK_STATE_UNKNOWN then it's LINK_STATE_UNKNOWN.
Otherwise it's LINK_STATE_DOWN.


# 1.174 01-Aug-2020 maxv

Remove #ifdef BRIDGE_IPF, compile in the code by default. Sent to
tech-net@.


# 1.173 01-May-2020 jdolecek

report no enabled capabilities when no interface is part of bridge


# 1.172 30-Apr-2020 jdolecek

for bridge(4), report the common enabled capabilities of the members
via SIOCGIFCAP for visibility


# 1.171 27-Apr-2020 jdolecek

if MTU of the added interface doesn't match the bridge, modify the MTU
of the interface to that of the bridge instead of just refusing the
addition with EINVAL

this is a convenience feature to simplify bridge setup with non-standard
MTU, the useful behaviour observed with Linux xenbr


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406
# 1.170 27-Mar-2020 jdolecek

replace the conditional m_pullup() on start of bridge_output() with
a KASSERT(), to make it clear no mbuf manipulation is ever done here

the condition should never trigger, this always runs after ether_output()
M_PREPEND()s ether_header


# 1.169 24-Mar-2020 jdolecek

reset the csum_flags in bridge_brodcast() also for bmcast path

for destination interfaces with real hardware offloading this fixes
multicast packet corruption; for xvif(4) this fix stops treating them
as having no csum

may fix PR kern/42386


Revision tags: ad-namecache-base3
# 1.168 24-Feb-2020 rin

Remove debug printf I put into bridge_calc_csum_flags().
Sorry for noise.


# 1.167 23-Feb-2020 jdolecek

disable the DEBUG bridge_calc_csum_flags() printf


# 1.166 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.165 05-Aug-2019 msaitoh

branches: 1.165.2;
Cast uint32_t to avoid undefined behavior in bridge_rthash(). Found by kUBSan.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

branches: 1.164.4;
Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

branches: 1.156.2;
Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.178 14-Feb-2021 christos

- centralize header align and pullup into a single inline function
- use a single macro to align pointers and expose the alignment, instead
of hard-coding 3 in 1/2 the macros.
- fix an issue in the ipv6 lt2p where it was aligning for ipv4 and pulling
for ipv6.


Revision tags: thorpej-futex-base
# 1.177 02-Nov-2020 roy

bridge: revert prior

It's of little use.
If we need to do this in the future, consider a sysctl to do it for all
interfaces in the bridge and not just the one being added.


# 1.176 27-Sep-2020 roy

branches: 1.176.2;
bridge: When an interface joins then mark addresses on it as tentative

The exact flow is detatch addresses, join bridge and then mark detached
addresses as tentative.
This ensures that Duplicate Address Detection for the joining interface
are performed across all members of the bridge.


# 1.175 27-Sep-2020 roy

bridge: Calculate link state as the best link state of any member

If any member is LINK_STATE_UP then it's LINK_STATE_UP.
Otherwise if any member is LINK_STATE_UNKNOWN then it's LINK_STATE_UNKNOWN.
Otherwise it's LINK_STATE_DOWN.


# 1.174 01-Aug-2020 maxv

Remove #ifdef BRIDGE_IPF, compile in the code by default. Sent to
tech-net@.


# 1.173 01-May-2020 jdolecek

report no enabled capabilities when no interface is part of bridge


# 1.172 30-Apr-2020 jdolecek

for bridge(4), report the common enabled capabilities of the members
via SIOCGIFCAP for visibility


# 1.171 27-Apr-2020 jdolecek

if MTU of the added interface doesn't match the bridge, modify the MTU
of the interface to that of the bridge instead of just refusing the
addition with EINVAL

this is a convenience feature to simplify bridge setup with non-standard
MTU, the useful behaviour observed with Linux xenbr


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406
# 1.170 27-Mar-2020 jdolecek

replace the conditional m_pullup() on start of bridge_output() with
a KASSERT(), to make it clear no mbuf manipulation is ever done here

the condition should never trigger, this always runs after ether_output()
M_PREPEND()s ether_header


# 1.169 24-Mar-2020 jdolecek

reset the csum_flags in bridge_brodcast() also for bmcast path

for destination interfaces with real hardware offloading this fixes
multicast packet corruption; for xvif(4) this fix stops treating them
as having no csum

may fix PR kern/42386


Revision tags: ad-namecache-base3
# 1.168 24-Feb-2020 rin

Remove debug printf I put into bridge_calc_csum_flags().
Sorry for noise.


# 1.167 23-Feb-2020 jdolecek

disable the DEBUG bridge_calc_csum_flags() printf


# 1.166 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.165 05-Aug-2019 msaitoh

branches: 1.165.2;
Cast uint32_t to avoid undefined behavior in bridge_rthash(). Found by kUBSan.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

branches: 1.164.4;
Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

branches: 1.156.2;
Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.177 02-Nov-2020 roy

bridge: revert prior

It's of little use.
If we need to do this in the future, consider a sysctl to do it for all
interfaces in the bridge and not just the one being added.


Revision tags: thorpej-futex-base
# 1.176 27-Sep-2020 roy

bridge: When an interface joins then mark addresses on it as tentative

The exact flow is detatch addresses, join bridge and then mark detached
addresses as tentative.
This ensures that Duplicate Address Detection for the joining interface
are performed across all members of the bridge.


# 1.175 27-Sep-2020 roy

bridge: Calculate link state as the best link state of any member

If any member is LINK_STATE_UP then it's LINK_STATE_UP.
Otherwise if any member is LINK_STATE_UNKNOWN then it's LINK_STATE_UNKNOWN.
Otherwise it's LINK_STATE_DOWN.


# 1.174 01-Aug-2020 maxv

Remove #ifdef BRIDGE_IPF, compile in the code by default. Sent to
tech-net@.


# 1.173 01-May-2020 jdolecek

report no enabled capabilities when no interface is part of bridge


# 1.172 30-Apr-2020 jdolecek

for bridge(4), report the common enabled capabilities of the members
via SIOCGIFCAP for visibility


# 1.171 27-Apr-2020 jdolecek

if MTU of the added interface doesn't match the bridge, modify the MTU
of the interface to that of the bridge instead of just refusing the
addition with EINVAL

this is a convenience feature to simplify bridge setup with non-standard
MTU, the useful behaviour observed with Linux xenbr


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406
# 1.170 27-Mar-2020 jdolecek

replace the conditional m_pullup() on start of bridge_output() with
a KASSERT(), to make it clear no mbuf manipulation is ever done here

the condition should never trigger, this always runs after ether_output()
M_PREPEND()s ether_header


# 1.169 24-Mar-2020 jdolecek

reset the csum_flags in bridge_brodcast() also for bmcast path

for destination interfaces with real hardware offloading this fixes
multicast packet corruption; for xvif(4) this fix stops treating them
as having no csum

may fix PR kern/42386


Revision tags: ad-namecache-base3
# 1.168 24-Feb-2020 rin

Remove debug printf I put into bridge_calc_csum_flags().
Sorry for noise.


# 1.167 23-Feb-2020 jdolecek

disable the DEBUG bridge_calc_csum_flags() printf


# 1.166 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.165 05-Aug-2019 msaitoh

branches: 1.165.2;
Cast uint32_t to avoid undefined behavior in bridge_rthash(). Found by kUBSan.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

branches: 1.164.4;
Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

branches: 1.156.2;
Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.176 27-Sep-2020 roy

bridge: When an interface joins then mark addresses on it as tentative

The exact flow is detatch addresses, join bridge and then mark detached
addresses as tentative.
This ensures that Duplicate Address Detection for the joining interface
are performed across all members of the bridge.


# 1.175 27-Sep-2020 roy

bridge: Calculate link state as the best link state of any member

If any member is LINK_STATE_UP then it's LINK_STATE_UP.
Otherwise if any member is LINK_STATE_UNKNOWN then it's LINK_STATE_UNKNOWN.
Otherwise it's LINK_STATE_DOWN.


# 1.174 01-Aug-2020 maxv

Remove #ifdef BRIDGE_IPF, compile in the code by default. Sent to
tech-net@.


# 1.173 01-May-2020 jdolecek

report no enabled capabilities when no interface is part of bridge


# 1.172 30-Apr-2020 jdolecek

for bridge(4), report the common enabled capabilities of the members
via SIOCGIFCAP for visibility


# 1.171 27-Apr-2020 jdolecek

if MTU of the added interface doesn't match the bridge, modify the MTU
of the interface to that of the bridge instead of just refusing the
addition with EINVAL

this is a convenience feature to simplify bridge setup with non-standard
MTU, the useful behaviour observed with Linux xenbr


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406
# 1.170 27-Mar-2020 jdolecek

replace the conditional m_pullup() on start of bridge_output() with
a KASSERT(), to make it clear no mbuf manipulation is ever done here

the condition should never trigger, this always runs after ether_output()
M_PREPEND()s ether_header


# 1.169 24-Mar-2020 jdolecek

reset the csum_flags in bridge_brodcast() also for bmcast path

for destination interfaces with real hardware offloading this fixes
multicast packet corruption; for xvif(4) this fix stops treating them
as having no csum

may fix PR kern/42386


Revision tags: ad-namecache-base3
# 1.168 24-Feb-2020 rin

Remove debug printf I put into bridge_calc_csum_flags().
Sorry for noise.


# 1.167 23-Feb-2020 jdolecek

disable the DEBUG bridge_calc_csum_flags() printf


# 1.166 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.165 05-Aug-2019 msaitoh

branches: 1.165.2;
Cast uint32_t to avoid undefined behavior in bridge_rthash(). Found by kUBSan.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

branches: 1.164.4;
Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

branches: 1.156.2;
Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.174 01-Aug-2020 maxv

Remove #ifdef BRIDGE_IPF, compile in the code by default. Sent to
tech-net@.


# 1.173 01-May-2020 jdolecek

report no enabled capabilities when no interface is part of bridge


# 1.172 30-Apr-2020 jdolecek

for bridge(4), report the common enabled capabilities of the members
via SIOCGIFCAP for visibility


# 1.171 27-Apr-2020 jdolecek

if MTU of the added interface doesn't match the bridge, modify the MTU
of the interface to that of the bridge instead of just refusing the
addition with EINVAL

this is a convenience feature to simplify bridge setup with non-standard
MTU, the useful behaviour observed with Linux xenbr


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406
# 1.170 27-Mar-2020 jdolecek

replace the conditional m_pullup() on start of bridge_output() with
a KASSERT(), to make it clear no mbuf manipulation is ever done here

the condition should never trigger, this always runs after ether_output()
M_PREPEND()s ether_header


# 1.169 24-Mar-2020 jdolecek

reset the csum_flags in bridge_brodcast() also for bmcast path

for destination interfaces with real hardware offloading this fixes
multicast packet corruption; for xvif(4) this fix stops treating them
as having no csum

may fix PR kern/42386


Revision tags: ad-namecache-base3
# 1.168 24-Feb-2020 rin

Remove debug printf I put into bridge_calc_csum_flags().
Sorry for noise.


# 1.167 23-Feb-2020 jdolecek

disable the DEBUG bridge_calc_csum_flags() printf


# 1.166 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.165 05-Aug-2019 msaitoh

branches: 1.165.2;
Cast uint32_t to avoid undefined behavior in bridge_rthash(). Found by kUBSan.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

branches: 1.164.4;
Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

branches: 1.156.2;
Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.173 01-May-2020 jdolecek

report no enabled capabilities when no interface is part of bridge


# 1.172 30-Apr-2020 jdolecek

for bridge(4), report the common enabled capabilities of the members
via SIOCGIFCAP for visibility


# 1.171 27-Apr-2020 jdolecek

if MTU of the added interface doesn't match the bridge, modify the MTU
of the interface to that of the bridge instead of just refusing the
addition with EINVAL

this is a convenience feature to simplify bridge setup with non-standard
MTU, the useful behaviour observed with Linux xenbr


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406
# 1.170 27-Mar-2020 jdolecek

replace the conditional m_pullup() on start of bridge_output() with
a KASSERT(), to make it clear no mbuf manipulation is ever done here

the condition should never trigger, this always runs after ether_output()
M_PREPEND()s ether_header


# 1.169 24-Mar-2020 jdolecek

reset the csum_flags in bridge_brodcast() also for bmcast path

for destination interfaces with real hardware offloading this fixes
multicast packet corruption; for xvif(4) this fix stops treating them
as having no csum

may fix PR kern/42386


Revision tags: ad-namecache-base3
# 1.168 24-Feb-2020 rin

Remove debug printf I put into bridge_calc_csum_flags().
Sorry for noise.


# 1.167 23-Feb-2020 jdolecek

disable the DEBUG bridge_calc_csum_flags() printf


# 1.166 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.165 05-Aug-2019 msaitoh

branches: 1.165.2;
Cast uint32_t to avoid undefined behavior in bridge_rthash(). Found by kUBSan.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

branches: 1.164.4;
Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

branches: 1.156.2;
Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.172 30-Apr-2020 jdolecek

for bridge(4), report the common enabled capabilities of the members
via SIOCGIFCAP for visibility


# 1.171 27-Apr-2020 jdolecek

if MTU of the added interface doesn't match the bridge, modify the MTU
of the interface to that of the bridge instead of just refusing the
addition with EINVAL

this is a convenience feature to simplify bridge setup with non-standard
MTU, the useful behaviour observed with Linux xenbr


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406
# 1.170 27-Mar-2020 jdolecek

replace the conditional m_pullup() on start of bridge_output() with
a KASSERT(), to make it clear no mbuf manipulation is ever done here

the condition should never trigger, this always runs after ether_output()
M_PREPEND()s ether_header


# 1.169 24-Mar-2020 jdolecek

reset the csum_flags in bridge_brodcast() also for bmcast path

for destination interfaces with real hardware offloading this fixes
multicast packet corruption; for xvif(4) this fix stops treating them
as having no csum

may fix PR kern/42386


Revision tags: ad-namecache-base3
# 1.168 24-Feb-2020 rin

Remove debug printf I put into bridge_calc_csum_flags().
Sorry for noise.


# 1.167 23-Feb-2020 jdolecek

disable the DEBUG bridge_calc_csum_flags() printf


# 1.166 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.165 05-Aug-2019 msaitoh

branches: 1.165.2;
Cast uint32_t to avoid undefined behavior in bridge_rthash(). Found by kUBSan.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

branches: 1.164.4;
Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

branches: 1.156.2;
Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.171 27-Apr-2020 jdolecek

if MTU of the added interface doesn't match the bridge, modify the MTU
of the interface to that of the bridge instead of just refusing the
addition with EINVAL

this is a convenience feature to simplify bridge setup with non-standard
MTU, the useful behaviour observed with Linux xenbr


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406
# 1.170 27-Mar-2020 jdolecek

replace the conditional m_pullup() on start of bridge_output() with
a KASSERT(), to make it clear no mbuf manipulation is ever done here

the condition should never trigger, this always runs after ether_output()
M_PREPEND()s ether_header


# 1.169 24-Mar-2020 jdolecek

reset the csum_flags in bridge_brodcast() also for bmcast path

for destination interfaces with real hardware offloading this fixes
multicast packet corruption; for xvif(4) this fix stops treating them
as having no csum

may fix PR kern/42386


Revision tags: ad-namecache-base3
# 1.168 24-Feb-2020 rin

Remove debug printf I put into bridge_calc_csum_flags().
Sorry for noise.


# 1.167 23-Feb-2020 jdolecek

disable the DEBUG bridge_calc_csum_flags() printf


# 1.166 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.165 05-Aug-2019 msaitoh

branches: 1.165.2;
Cast uint32_t to avoid undefined behavior in bridge_rthash(). Found by kUBSan.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

branches: 1.164.4;
Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

branches: 1.156.2;
Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.170 27-Mar-2020 jdolecek

replace the conditional m_pullup() on start of bridge_output() with
a KASSERT(), to make it clear no mbuf manipulation is ever done here

the condition should never trigger, this always runs after ether_output()
M_PREPEND()s ether_header


# 1.169 24-Mar-2020 jdolecek

reset the csum_flags in bridge_brodcast() also for bmcast path

for destination interfaces with real hardware offloading this fixes
multicast packet corruption; for xvif(4) this fix stops treating them
as having no csum

may fix PR kern/42386


Revision tags: ad-namecache-base3
# 1.168 24-Feb-2020 rin

Remove debug printf I put into bridge_calc_csum_flags().
Sorry for noise.


# 1.167 23-Feb-2020 jdolecek

disable the DEBUG bridge_calc_csum_flags() printf


# 1.166 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.165 05-Aug-2019 msaitoh

branches: 1.165.2;
Cast uint32_t to avoid undefined behavior in bridge_rthash(). Found by kUBSan.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

branches: 1.164.4;
Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

branches: 1.156.2;
Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.169 24-Mar-2020 jdolecek

reset the csum_flags in bridge_brodcast() also for bmcast path

for destination interfaces with real hardware offloading this fixes
multicast packet corruption; for xvif(4) this fix stops treating them
as having no csum

may fix PR kern/42386


Revision tags: ad-namecache-base3
# 1.168 24-Feb-2020 rin

Remove debug printf I put into bridge_calc_csum_flags().
Sorry for noise.


# 1.167 23-Feb-2020 jdolecek

disable the DEBUG bridge_calc_csum_flags() printf


# 1.166 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.165 05-Aug-2019 msaitoh

branches: 1.165.2;
Cast uint32_t to avoid undefined behavior in bridge_rthash(). Found by kUBSan.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

branches: 1.164.4;
Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

branches: 1.156.2;
Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.168 24-Feb-2020 rin

Remove debug printf I put into bridge_calc_csum_flags().
Sorry for noise.


# 1.167 23-Feb-2020 jdolecek

disable the DEBUG bridge_calc_csum_flags() printf


# 1.166 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.165 05-Aug-2019 msaitoh

Cast uint32_t to avoid undefined behavior in bridge_rthash(). Found by kUBSan.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

branches: 1.156.2;
Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.166 29-Jan-2020 thorpej

Adopt <net/if_stats.h>.


Revision tags: ad-namecache-base2 ad-namecache-base1 ad-namecache-base phil-wifi-20191119
# 1.165 05-Aug-2019 msaitoh

Cast uint32_t to avoid undefined behavior in bridge_rthash(). Found by kUBSan.


Revision tags: netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

branches: 1.156.2;
Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.165 05-Aug-2019 msaitoh

Cast uint32_t to avoid undefined behavior in bridge_rthash(). Found by kUBSan.


Revision tags: netbsd-9-base phil-wifi-20190609 isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

branches: 1.156.2;
Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


Revision tags: isaki-audio2-base pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.164 22-Dec-2018 rin

Take the interface out of promiscuous mode in bridge_delete_member()
instead of bridge_ioctl_del(). Otherwise, the member interfaces are
left in promiscuous mode when the bridge is destroyed.


# 1.163 15-Dec-2018 rin

Improve wording in comments: replace "chain" with "queue" for
sequence of mbuf's connected by m_nextpkt, in order to avoid
confusion with those connected by m_next.

No binary changes.


# 1.162 14-Dec-2018 martin

Need <netinet6/ip6_var.h> for ip6_statinc() prototype.


# 1.161 12-Dec-2018 rin

PR kern/53562

Handle TX offload in software when a packet is sent via
bridge_output(). We can send it as is in the following
exceptional cases:

For unicast:

(1) When the destination interface is the same as source.

(2) When the destination supports all TX offload options
specified in a packet.

For multicast/broadcast:

(3) When all the members of the bridge support the specified
TX offload options.

For (3), add sc_csum_flags_tx flag to bridge softc, which is
logical AND b/w capabilities of TX offload options in member
interface (ifp->if_csum_flags_tx). The flag is updated when a
member is (i) added to or (ii) removed from a bridge, or (iii)
if_csum_flags_tx flag of a member interface is manipulated via
ifconfig(8).

Turn on M_CSUM_TSOv[46] bit in ifp->if_csum_flags_tx flag when
TSO[46] is enabled for that interface.

OK msaitoh thorpej


Revision tags: pgoyette-compat-1126
# 1.160 09-Nov-2018 ozaki-r

Fix that brconfig <bridge> (addr) can't show a large number of MAC addresses

The command shows only 256 addresses at maximum even if a bridge caches more
addresses. It occurs because the kernel doesn't return an error if the command
passes a short buffer that can't store all cached addresses; the kernel fills
cached addresses as much as possible and returns it without telling that the
result is truncated.

Fix the issue by telling a required size of a buffer if a buffer passed from the
command is not enough, which lets the command retry with an enough buffer.

Reported by k-goda@IIJ


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930
# 1.159 19-Sep-2018 msaitoh

Micro optimization. m_copym(M_COPYALL) -> m_copypacket().


# 1.158 18-Sep-2018 msaitoh

- Fix bridge_enqueue() which was broken by last commit. Use correct mbuf
pointer.
- Modify comment.


# 1.157 14-Sep-2018 msaitoh

Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
flags. bridge_enqueue() is called from bridge_output() when a packet is
spontaneous. Clear csum_flags before calling brige_enqueue() in
bridge_forward() or bridge_broadcast() instead of in the beginning of
bridge_enqueue().

Note that this change doesn't fix a problem on the following configuration:

A bridge has two or more interfaces.

An address is assigned to an bridge member interface and
some offload flags are set.

Another interface has no address and has no any offload flag.

XXX pullup-[78]


Revision tags: pgoyette-compat-0906 pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625
# 1.156 25-May-2018 ozaki-r

Ensure to call if_register after interface initializations finish


Revision tags: pgoyette-compat-0521
# 1.155 14-May-2018 ozaki-r

Protect packet input routines with KERNEL_LOCK and splsoftnet

if_input, i.e, ether_input and friends, now runs in softint without any
protections. It's ok for ether_input itself because it's already MP-safe,
however, subsequent routines called from it such as carp_input and agr_input
aren't safe because they're not MP-safe. Protect if_input with KERNEL_LOCK.

if_input can be called from a normal LWP context. In that case we need to
prevent interrupts (softint) from running by splsoftnet to protect non-MP-safe
codes (e.g., carp_input and agr_input).

Pointed out by mlelstv@


Revision tags: pgoyette-compat-0502 pgoyette-compat-0422
# 1.154 18-Apr-2018 ozaki-r

Add missing PSLIST_ENTRY_INIT and PSLIST_ENTRY_DESTROY


# 1.153 18-Apr-2018 ozaki-r

Get rid of a unnecessary semicolon

Pointed out by kamil@


# 1.152 18-Apr-2018 ozaki-r

bridge: use pslist(9) for rtlist and rthash

The change fixes race conditions on list operations. One example is that a
reader may see invalid pointers on a looking item in a list due to lack of
membar_producer.


# 1.151 18-Apr-2018 ozaki-r

Simplify bridge_rtnode_insert (NFC)


# 1.150 18-Apr-2018 ozaki-r

Remove obsolete NULL checks


Revision tags: pgoyette-compat-0415
# 1.149 10-Apr-2018 ozaki-r

Fix bridge_rtdelete

It removes a rtable entry that belongs to a specified interface, however, its
original behavior was to delete all belonging entries. Restore the original
behavior.


Revision tags: pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.148 15-Jan-2018 maxv

branches: 1.148.2;
If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-2-RELEASE netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.148 15-Jan-2018 maxv

If the bridge is not running, don't call bridge_stop. Otherwise the
following commands will crash the kernel:

ifconfig bridge0 create
ifconfig bridge0 destroy


# 1.147 28-Dec-2017 ozaki-r

Ensure the timer isn't running by using workqueue_wait


# 1.146 19-Dec-2017 ozaki-r

Don't set IFEF_MPSAFE unless NET_MPSAFE at this point

Because recent investigations show that interfaces with IFEF_MPSAFE need to
follow additional restrictions to work with the flag safely. We should enable it
on an interface by default only if the interface surely satisfies the
restrictions, which are described in if.h.

Note that enabling IFEF_MPSAFE solely gains a few benefit on performance because
the network stack is still serialized by the big kernel locks by default.


# 1.145 11-Dec-2017 ozaki-r

Wrap if_ioctl_lock with IFNET_* macros (NFC)

Also if_ioctl_lock perhaps needs to be renamed to something because it's now
not just for ioctl...


# 1.144 08-Dec-2017 ozaki-r

Fix build of kernels without ether

By throwing out if_enable_vlan_mtu and if_disable_vlan_mtu that
created a unnecessary dependency from if.c to if_ethersubr.c.

PR kern/52790


# 1.143 06-Dec-2017 ozaki-r

Ensure to not turn on IFF_RUNNING of an interface until its initialization completes

And ensure to turn off it before destruction as per IFF_RUNNING's description
"resource allocated". (The description is a bit doubtful though, I believe the
change is still proper.)


# 1.142 06-Dec-2017 ozaki-r

Ensure to hold if_ioctl_lock when calling if_flags_set


Revision tags: tls-maxphys-base-20171202
# 1.141 17-Nov-2017 ozaki-r

Add missing IFEF_NO_LINK_STATE_CHANGE to bridge


# 1.140 16-Nov-2017 ozaki-r

Unify IFEF_*_MPSAFE into IFEF_MPSAFE

There are already two flags for if_output and if_start, however, it seems such
MPSAFE flags are eventually needed for all if_XXX operations. Having discrete
flags for each operation is wasteful of if_extflags bits. So let's unify
the flags into one: IFEF_MPSAFE.

Fortunately IFEF_*_MPSAFE flags have never been included in any releases, so
we can change them without breaking backward compatibility of the releases
(though the kernel version of -current should be bumped).

Note that if an interface have both MP-safe and non-MP-safe operations at a
time, we have to set the IFEF_MPSAFE flag and let callees of non-MP-safe
opeartions take the kernel lock.

Proposed on tech-kern@ and tech-net@


# 1.139 15-Nov-2017 ozaki-r

Mark callouts of bridge CALLOUT_MPSAFE


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-1-RELEASE netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.138 25-Oct-2017 ozaki-r

Remove unnecessary splsoftnet


# 1.137 25-Oct-2017 ozaki-r

Don't free sc_rthash twice


# 1.136 23-Oct-2017 msaitoh

- If if_initialize() failed in the attach function, free resources and return.
- Add some missing frees in bridge_clone_destroy().
- KNF


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.135 02-Oct-2017 ozaki-r

Add curlwp_bind to bridge_input for psref

It can be called in a thread context via tap (tap_dev_write).

Fix PR kern/52587


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320
# 1.134 07-Mar-2017 ozaki-r

branches: 1.134.6;
Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

branches: 1.131.2;
Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-RELEASE netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

branches: 1.90.2;
Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.134 07-Mar-2017 ozaki-r

Remove unnecessary splnet for bridge_enqueue

bridge_enqueue now uses if_transmit_lock that does splnet for device
drivers, so splnet for bridge_enqueue isn't needed anymore.


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-1-RC2 netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.133 16-Feb-2017 knakahara

add l2tp(4) L2TPv3 interface.

originally implemented by IIJ SEIL team.


Revision tags: nick-nhusb-base-20170204
# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


# 1.132 23-Jan-2017 ozaki-r

Replace some splnet with splsoftnet


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-nhusb-base-20170116 netbsd-7-1-RC1 netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004
# 1.131 15-Sep-2016 christos

Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)


Revision tags: localcount-20160914
# 1.130 29-Aug-2016 ozaki-r

KNF; replace white spaces with hard tabs

No functional change.


Revision tags: pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.129 22-Jun-2016 knakahara

branches: 1.129.2;
fix: locking about IFQ_ENQUEUE and ALTQ

- If NET_MPSAFE is not defined, IFQ_LOCK is nop. Currently, that means
IFQ_ENQUEUE() of some paths such as bridge_enqueue() is called parallel
wrongly.
- If ALTQ is enabled, Tx processing should call if_transmit() (= IFQ_ENQUEUE
+ ifp->if_start()) instead of ifp->if_transmit() to call ALTQ_ENQUEUE()
and ALTQ_DEQUEUE().
Furthermore, ALTQ processing is always required KERNEL_LOCK currently.


# 1.128 20-Jun-2016 knakahara

fix: should not assert IFEF_OUTPUT_MPSAFE in bridge_output()


# 1.127 20-Jun-2016 knakahara

tentative fix for ATF(net/if_bridge/t_bridge)


# 1.126 20-Jun-2016 knakahara

make bridge_output MP-safe, so that bridge(4) can enable IFEF_OUTPUT_MPSAFE.

making MP-scalable is future work.


# 1.125 10-Jun-2016 ozaki-r

Avoid storing a pointer of an interface in a mbuf

Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.


# 1.124 10-Jun-2016 ozaki-r

Introduce m_set_rcvif and m_reset_rcvif

The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.


Revision tags: nick-nhusb-base-20160529
# 1.123 16-May-2016 ozaki-r

Apply if_get and if_put to bridge(4)


# 1.122 04-May-2016 roy

Allow multicast/broadcast packets from a bridge member to other members.
Note this should just call bridge_broadcast when more locking issues are
resolved.


# 1.121 28-Apr-2016 knakahara

introduce new ifnet MP-scalable sending interface "if_transmit".


# 1.120 28-Apr-2016 ozaki-r

Constify rtentry of if_output

We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.


# 1.119 24-Apr-2016 christos

CID 1358673: dead code


Revision tags: nick-nhusb-base-20160422
# 1.118 22-Apr-2016 roy

Change used from int to bool.
If used, abort the loop because we think we're already at the end.


# 1.117 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (3/3) : eliminate pktattr argument from IFQ_ENQUEUE caller


# 1.116 20-Apr-2016 knakahara

IFQ_ENQUEUE refactor (2/3) : eliminate pktattr argument from altq implemantation


# 1.115 19-Apr-2016 ozaki-r

Apply psref(9) to bridge(4)

Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).


# 1.114 19-Apr-2016 ozaki-r

Remove BRIDGE_MPSAFE switch and enable MP-safe code by default

We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.

Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).


# 1.113 11-Apr-2016 ozaki-r

Fix usage of pslist(9)

Pointed out by riastradh@.


# 1.112 11-Apr-2016 ozaki-r

Use pslist(9) in bridge(4)

This adds missing memory barriers to list operations for pserialize.


# 1.111 28-Mar-2016 ozaki-r

Remove unused global bridge list

Pointed out by riastradh@


# 1.110 23-Mar-2016 ozaki-r

Fix LIST_FOREACH argument


# 1.109 23-Mar-2016 ozaki-r

Use LIST_FOREACH instead of LIST_FOREACH_SAFE

No need to use *_SAFE because we don't remove any items in the loop.


Revision tags: nick-nhusb-base-20160319
# 1.108 15-Feb-2016 ozaki-r

Simplify bridge(4)

Thanks to introducing softint-based if_input, the entire bridge code now
never run in hardware interrupt context. So we can simplify the code.

- Remove spin mutexes
- They were needed because some code of bridge could run in
hardware interrupt context
- We now need only an adaptive mutex for each shared object
(a member list and a forwarding table)
- Remove pktqueue
- bridge_input is already in softint, using another softint
(for bridge_forward) is useless
- Packet distribution should be down at device drivers


# 1.107 10-Feb-2016 ozaki-r

Don't share struct work, instead have one per softc

Pointed out by riastradh@


# 1.106 09-Feb-2016 ozaki-r

Introduce softint-based if_input

This change intends to run the whole network stack in softint context
(or normal LWP), not hardware interrupt context. Note that the work is
still incomplete by this change; to that end, we also have to softint-ify
if_link_state_change (and bpf) which can still run in hardware interrupt.

This change softint-ifies at ifp->if_input that is called from
each device driver (and ieee80211_input) to ensure Layer 2 runs
in softint (e.g., ether_input and bridge_input). To this end,
we provide a framework (called percpuq) that utlizes softint(9)
and percpu ifqueues. With this patch, rxintr of most drivers just
queues received packets and schedules a softint, and the softint
dequeues packets and does rest packet processing.

To minimize changes to each driver, percpuq is allocated in struct
ifnet for now and that is initialized by default (in if_attach).
We probably have to move percpuq to softc of each driver, but it's
future work. At this point, only wm(4) has percpuq in its softc
as a reference implementation.

Additional information including performance numbers can be found
in the thread at tech-kern@ and tech-net@:
http://mail-index.netbsd.org/tech-kern/2016/01/14/msg019997.html

Acknowledgment: riastradh@ greatly helped this work.
Thank you very much!


Revision tags: nick-nhusb-base-20151226
# 1.105 19-Nov-2015 christos

Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig@espci.fr). Factor out the vlan_mtu enabling and
disabling code.


# 1.104 20-Oct-2015 maxv

Harmless alloc inconsistency; make sure the exact same argument is given to
kmem_alloc/kmem_free. Found by Brainy.


# 1.103 07-Oct-2015 ozaki-r

Enqueue frames to a curcpu's pktqueue

Currently RX can run on a CPU other than CPU#0, so always enqueuing
to a pktqueue of CPU#0 makes no sense. Let's use a curcpu's pktqueue,
although bridge_foward softint doesn't run in parallel without
NET_MPSAFE.

This is a temporal solution. We need a fundamental solution.


Revision tags: nick-nhusb-base-20150921
# 1.102 28-Aug-2015 rjs

Don't set M_PROTO1 in mbuf flags.

This was left over from the old usage of gif(4) with bridges.


# 1.101 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.100 23-Jul-2015 ozaki-r

Fix PR 48104

So far bridge cannot receive frames via a member interface when the frames
come from another member interface. So when we assign an IP address to
a member interface, hosts connected to another member interface cannot
ping to the IP address. That behavior isn't expected. See PR 48104 for
more realistic examples of this issue.

The change does:
- drop M_PROMISC before ether_input, which allows a bridge member interface
to receive a frame coming from another bridge member interface
- receive broadcast/multicast frames via all bridge member interfaces,
which is required to receive IPv6 multicast packets destined to a
multicast group belonging to a bridge member interface that is different
from a packet arrival interface

roy@ helped testing of the fix, thanks!


Revision tags: nick-nhusb-base-20150606
# 1.99 01-Jun-2015 matt

Modify the BRDGGIFS and BRDGRTS cmds to be more COMPAT_NETBSD32 friendly.
(XXX whitespace)


# 1.98 16-Apr-2015 ozaki-r

Fix racy bridge_delete_member

It can be called from bridge_ioctl_del and bridge_clone_destroy with
a same bridge member (bif) at the same time. We have to prevent
that happens.

Pointed out by riastradh@


Revision tags: nick-nhusb-base-20150406
# 1.97 08-Jan-2015 ozaki-r

Use pserialize for rtlist in bridge

This change enables lockless accesses to bridge rtable lists.
See locking notes in a comment to know how pserialize and
mutexes are used. Some functions are rearranged to use
pserialize. A workqueue is introduced to use pserialize in
bridge_rtage via bridge_timer callout.

As usual, pserialize and mutexes are used only when NET_MPSAFE
on. On the other hand, the newly added workqueue is used
regardless of NET_MPSAFE on or off.


# 1.96 01-Jan-2015 ozaki-r

Reset the expire time of a cache on receiving a frame for the cache

The expire time of a cache in a bridge MAC address table was never reset
once it is initialized regardless of traffic for the cache. The behavior
isn't supposed and active caches are unnecessarily expired and removed.

PR kern/49507


# 1.95 31-Dec-2014 ozaki-r

Use pserialize in bridge

This change enables lockless accesses to bridge member lists.
See locking notes in a comment to know how pserialize and
mutexes are used.

This change also provides support for softint-based interrupt
handling; pserialize readers can run in both HW interrupt and
softint contexts.

As usual, pserialize is used only when NET_MPSAFE on.


# 1.94 25-Dec-2014 ozaki-r

Use LIST_FOREACH_SAFE in bridge_rt* functions


# 1.93 24-Dec-2014 ozaki-r

Replace malloc/free with kmem_* in if_bridge

Additionally M_NOWAIT is replaced with KM_SLEEP.


# 1.92 22-Dec-2014 ozaki-r

Call ether_input/m_freem without holding a lock or referencing unnecessary objects

When NET_MPSAFE on, a bridge tries to pass up a packet to Layer 3
(or call m_freem) with holding a lock or referencing unnecessary
objects. That causes random lock ups. The change fixes the issue.


Revision tags: nick-nhusb-base
# 1.91 15-Aug-2014 ozaki-r

branches: 1.91.2;
bridge: reject non-IFF_SIMPLEX interfaces

bridge does not work with !IFF_SIMPLEX interfaces (PR/18035);
the bug is not yet fixed. Until it gets fixed, we should
reject non-IFF_SIMPLEX interfaces.

Discussed with pooka@


Revision tags: netbsd-7-0-2-RELEASE netbsd-7-nhusb-base netbsd-7-0-1-RELEASE netbsd-7-0-RELEASE netbsd-7-0-RC3 netbsd-7-0-RC2 netbsd-7-0-RC1 netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.90 23-Jul-2014 ozaki-r

Avoid calling copyout with holding mutex(IPL_NET)

Because copyout may lead a page fault that may sleep, we have to pull it
out from the critical section of mutex(IPL_NET) in bridge_ioctl_gifs.


# 1.89 23-Jul-2014 ozaki-r

Add missing unlock


# 1.88 20-Jul-2014 ozaki-r

Don't return ENETRESET when ioctl SIOCSIFMTU

Otherwise, just changing MTU with ifconfig shows
a confusable error message.

RP kern/48996


# 1.87 14-Jul-2014 ozaki-r

Make bridge MPSAFE

- Introduce BRIDGE_MPSAFE
- It's enabled only when NET_MPSAFE is defined
in if.h or the kernel config
- Add iflist and rtlist mutex locks
- Locking iflist is performance sensitive,
so it's not used when !BRIDGE_MPSAFE
- Add bif object reference counting
- It enables fine-grain locking for bridge member lists
by allowing to not hold a lock during touching a bif
- bridge_release_member is added to decrement the
reference count
- A condition variable is added to do bridge_delete_member
gracefully
- Add if_bridgeif to ifnet
- It's a shortcut to a bif object of a bridge member
- It reduces a bif lookup cost and so lock contention on iflist
- Make bridgestp MPSAFE too


# 1.86 02-Jul-2014 ozaki-r

Protect bridge_list with a mutex


# 1.85 02-Jul-2014 ozaki-r

Remove obsolete codes for if_snd


# 1.84 23-Jun-2014 ozaki-r

Get rid of unnecessary xc_broadcast after pktq_barrier

Pointed out by rmind@


# 1.83 18-Jun-2014 ozaki-r

Restructure bridge_input and bridge_broadcast

There are two changes:
- Assemble the places calling pktq_enqueue (bridge_forward)
for unicast and {b,m}cast frames into one
- Receive {b,m}cast frames in bridge_broadcast, not in
bridge_input

The changes make the code clear and readable. bridge_input
now doesn't need to take care of {b,m}cast frames;
bridge_forward and bridge_broadcast have the responsibility.

The changes are based on a patch of Lloyd Parkes submitted
in PR 48104, but don't fix its issue yet.


# 1.82 18-Jun-2014 ozaki-r

Tidy up bridge_input

No functional change.


# 1.81 17-Jun-2014 ozaki-r

Restructure ether_input and bridge_input

The network stack of NetBSD is well organized and
layered. A packet reception is processed from a
lower layer to an upper layer one by one. However,
ether_input and bridge_input are not structured so.
bridge_input is called inside ether_input.

The new structure replaces ifnet#if_input of a bridge
member with bridge_input when the member is attached.
So a packet goes straight on a packet reception via
a bridge, bridge_input => ether_input => ip_input.

The change is part of a patch of Lloyd Parkes submitted
in PR 48104. Unlike the patch, the change doesn't
intend to change the behavior of the packet processing.
Another patch will fix PR 48104.


# 1.80 16-Jun-2014 ozaki-r

Add net.interfaces.bridgeN.fwdq.{maxlen,len,drops} sysctl


# 1.79 16-Jun-2014 ozaki-r

Use pktqueue for bridge forwarding queue and softint


# 1.78 15-Jun-2014 ozaki-r

Get rid of unnecessary splnet for pool_{get,put}

A mutex prevents interrupts in the functions now.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base
# 1.77 29-Jun-2013 rmind

branches: 1.77.4;
- Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system). Make the structures
opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.76 22-Mar-2012 wiz

branches: 1.76.2; 1.76.4;
Fix typo in kauth name. From PR 46234 by Matthew Mondor.
Tested by Geoff Adams and Ryo ONODERA.


# 1.75 13-Mar-2012 elad

Replace the remaining KAUTH_GENERIC_ISSUSER authorization calls with
something meaningful. All relevant documentation has been updated or
written.

Most of these changes were brought up in the following messages:

http://mail-index.netbsd.org/tech-kern/2012/01/18/msg012490.html
http://mail-index.netbsd.org/tech-kern/2012/01/19/msg012502.html
http://mail-index.netbsd.org/tech-kern/2012/02/17/msg012728.html

Thanks to christos, manu, njoly, and jmmv for input.

Huge thanks to pgoyette for spinning these changes through some build
cycles and ATF.


Revision tags: netbsd-6-0-6-RELEASE netbsd-6-1-5-RELEASE netbsd-6-1-4-RELEASE netbsd-6-0-5-RELEASE netbsd-6-1-3-RELEASE netbsd-6-0-4-RELEASE netbsd-6-1-2-RELEASE netbsd-6-0-3-RELEASE netbsd-6-1-1-RELEASE netbsd-6-0-2-RELEASE netbsd-6-1-RELEASE netbsd-6-1-RC4 netbsd-6-1-RC3 netbsd-6-1-RC2 netbsd-6-1-RC1 netbsd-6-0-1-RELEASE matt-nb6-plus-nbase netbsd-6-0-RELEASE netbsd-6-0-RC2 matt-nb6-plus-base netbsd-6-0-RC1 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base
# 1.74 19-Nov-2011 tls

branches: 1.74.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator users AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.73 23-May-2011 joerg

branches: 1.73.4;
simplify


Revision tags: bouyer-quota2-nbase bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.72 07-Dec-2010 pooka

branches: 1.72.2;
_KERNEL_TOP


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base
# 1.71 19-Jan-2010 pooka

branches: 1.71.4;
Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


Revision tags: matt-premerge-20091211 yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 jym-xensuspend-nbase
# 1.70 17-May-2009 cegger

fix crash in bridge_ioctl():


BRDGGFLT and BRDGSFILT bridge controls are only available with BRIDGE_IPF and PFIL_HOOKS defined.
In amd64 GENERIC and XEN kernel configs PFIL_HOOKS is defined but BRIDGE_IPF is not.

When a BRDGGFLT or BRDGSFILT command comes in, then ifd->ifd_cmd is not in range
of bridge_control_table_size. Then bc is not set and is dereferenced
later => BOOM.


Revision tags: yamt-nfs-mp-base4 jym-xensuspend-base
# 1.69 12-May-2009 elad

Move kauth(9) call before going into splnet().

Mailing list reference:

http://mail-index.netbsd.org/tech-net/2009/05/08/msg001286.html


Revision tags: yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.68 04-Apr-2009 bouyer

Fix another typo


# 1.67 04-Apr-2009 bouyer

Fix a comment, and make it build.


# 1.66 04-Apr-2009 bouyer

Fixes from Masao Uebayashi


# 1.65 04-Apr-2009 bouyer

Fix for if_start() and pfil_hook() being called from hardware interrupt
context (reported on various mailing-lists, and part of PR kern/41114,
causing panic in pf(4) and possibly ipf(4) when BRIDGE_IPF is used).
Defer bridge_forward() to a software interrupt; bridge_input() enqueues
mbufs to ifp->if_snd which is handled in bridge_forward().


Revision tags: nick-hppapmap-base2
# 1.64 18-Jan-2009 mrg

branches: 1.64.2;
Fix multiple problems:

* A sign extension error creating the bridge ID corrupted the
priority (always making it the maximum).
* Do not catch STP packets on an interface for which STP is not
enabled -- it's a violation of the spec, and causes STP to fail on
neighboring bridges.
* An optimization to bstp_input() -- some information is already
known when we call it.

contributed anonymously.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base haad-dm-base mjf-devfs2-base
# 1.63 07-Nov-2008 dyoung

*** Summary ***

When a link-layer address changes (e.g., ifconfig ex0 link
02:de:ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address. (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior. Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability. KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR. In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr. That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR. In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR. For example, pull ..._init() out of any switch
statement that looks like this:

switch (...->sa_family) {
case ...:
..._init();
...
break;
...
default:
..._init();
...
break;
}

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

switch (x & (IFF_UP|IFF_RUNNING)) {
case 0:
...
break;
case IFF_RUNNING:
...
break;
case IFF_UP:
...
break;
case IFF_UP|IFF_RUNNING:
...
break;
}

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure. Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls. In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source. In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively. Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset. Delete unnecessary casts to void *. Use
sockaddr_in_init() and sockaddr_in6_init(). Compare pointers with
NULL instead of "testing truth". Replace some instances of (type
*)0 with NULL. Change some K&R prototypes to ANSI C, and join
lines.


Revision tags: netbsd-5-0-RC3 netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base wrstuden-revivesa-base
# 1.62 15-Jun-2008 christos

branches: 1.62.2; 1.62.4; 1.62.6;
- add if_alloc (ours just mallocs), and if_initname and use them (from FreeBSD)
- kill memsets where M_ZERO can be used.


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.61 15-Apr-2008 thorpej

branches: 1.61.2; 1.61.4; 1.61.6; 1.61.8;
Make ip6 and icmp6 stats per-cpu.


# 1.60 12-Apr-2008 cegger

make this build with BRIDGE_IPF and PFIL_HOOKS options


# 1.59 12-Apr-2008 thorpej

Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.


# 1.58 08-Apr-2008 thorpej

Change IPv6 stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.


# 1.57 07-Apr-2008 thorpej

Change IP stats from a structure to an array of uint64_t's.

Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase hpcarm-cleanup-base
# 1.56 20-Feb-2008 matt

branches: 1.56.6;
s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)


Revision tags: bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base
# 1.55 19-Jan-2008 dyoung

Use C99 array initializers for bridge_control_table[].


Revision tags: nick-csl-alignment-base5 bouyer-xeni386-merge1 matt-armv6-prevmlocking vmlocking2-base3 yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 jmcneill-base bouyer-xenamd64-base2 vmlocking-nbase yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base vmlocking-base
# 1.54 27-Aug-2007 dyoung

branches: 1.54.2; 1.54.8; 1.54.14;
LLADDR -> CLLADDR.


# 1.53 26-Aug-2007 dyoung

Constify: LLADDR -> CLLADDR. I'm aiming here to make it easier to
identify sockaddr_dl abuse that remains in the kernel, especially
the potential for overwriting memory past the end of a sockaddr_dl
with, e.g., memcpy(LLADDR(), ...).

Use sockaddr_dl_setaddr() in a few places.


Revision tags: matt-mips64-base nick-csl-alignment-base mjf-ufs-trans-base
# 1.52 09-Jul-2007 ad

branches: 1.52.2; 1.52.6;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.51 12-Mar-2007 ad

branches: 1.51.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.50 04-Mar-2007 christos

branches: 1.50.2;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.49 21-Feb-2007 dyoung

Use __arraycount().


# 1.48 17-Feb-2007 dyoung

KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
in6_src.c, avoid casts by changing several route_in6 pointers
to struct route pointers. Remove unnecessary casts to caddr_t
elsewhere.

Pave the way for eliminating address family-specific route caches:
soon, struct route will not embed a sockaddr, but it will hold
a reference to an external sockaddr, instead. We will set the
destination sockaddr using rtcache_setdst(). (I created a stub
for it, but it isn't used anywhere, yet.) rtcache_free() will
free the sockaddr. I have extracted from rtcache_free() a helper
subroutine, rtcache_clear(). rtcache_clear() will "forget" a
cached route, but it will not forget the destination by releasing
the sockaddr. I use rtcache_clear() instead of rtcache_free()
in rtcache_update(), because rtcache_update() is not supposed
to forget the destination.

Constify:

1 Introduce const accessor for route->ro_dst, rtcache_getdst().

2 Constify the 'dst' argument to ifnet->if_output(). This
led me to constify a lot of code called by output routines.

3 Constify the sockaddr argument to protosw->pr_ctlinput. This
led me to constify a lot of code called by ctlinput routines.

4 Introduce const macros for converting from a generic sockaddr
to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
satocsin, et cetera.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.47 04-Jan-2007 elad

branches: 1.47.2;
Consistent usage of KAUTH_GENERIC_ISSUSER.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.46 23-Nov-2006 rpaulo

New EtherIP driver based on tap(4) and gif(4) by Hans Rosenfeld.
Notable changes:
* Fixes PR 34268.
* Separates the code from gif(4) (which is more cleaner).
* Allows the usage of STP (Spanning Tree Protocol).
* Removed EtherIP implementation from gif(4)/tap(4).

Some input from Christos.


# 1.45 16-Nov-2006 christos

__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.44 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.43 13-Oct-2006 dogcow

More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.


# 1.42 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.41 05-Oct-2006 tls

Protect calls to pool_put/pool_get that may occur in interrupt context
with spl used to protect other allocations and frees, or datastructure
element insertion and removal, in adjacent code.

It is almost unquestionably the case that some of the spl()/splx() calls
added here are superfluous, but it really seems wrong to see:

s=splfoo();
/* frob data structure */
splx(s);
pool_put(x);

and if we think we need to protect the first operation, then it is hard
to see why we should not think we need to protect the next. "Better
safe than sorry".

It is also almost unquestionably the case that I missed some pool
gets/puts from interrupt context with my strategy for finding these
calls; use of PR_NOWAIT is a strong hint that a pool may be used from
interrupt context but many callers in the kernel pass a "can wait/can't
wait" flag down such that my searches might not have found them. One
notable area that needs to be looked at is pf.

See also:

http://mail-index.netbsd.org/tech-kern/2006/07/19/0003.html
http://mail-index.netbsd.org/tech-kern/2006/07/19/0009.html


Revision tags: abandoned-netbsd-4-base yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.40 23-Jul-2006 ad

branches: 1.40.4; 1.40.6;
Use the LWP cached credentials where sane.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.39 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.38 18-May-2006 liamjfoy

branches: 1.38.2;
Integrate Common Address Redundancy Procotol (CARP) from OpenBSD

'pseudo-device carp'

Thanks to: joerg@ christos@ riz@ and others who tested
Ok: core@


# 1.37 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.36 17-Jan-2006 christos

branches: 1.36.2; 1.36.4; 1.36.6; 1.36.8; 1.36.10;
Make sure that breq is also cleared (from Xin LI)


# 1.35 09-Jan-2006 christos

Make sure we initialize all structs to 0; from Xin LI


# 1.34 24-Dec-2005 perry

branches: 1.34.2;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.33 11-Dec-2005 thorpej

ANSI function decls and application of static.


# 1.32 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.31 01-Jun-2005 jdc

branches: 1.31.2;
Fix this properly by renaming the conflicting variables.


# 1.30 01-Jun-2005 jdc

Remove extraneous definition of struct llc (found by shadow warning).


Revision tags: netbsd-3-0-RELEASE netbsd-3-0-RC6 netbsd-3-0-RC5 netbsd-3-0-RC4 netbsd-3-0-RC3 netbsd-3-0-RC2 netbsd-3-0-RC1 yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.29 26-Feb-2005 perry

branches: 1.29.2; 1.29.4;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.28 31-Jan-2005 kim

Add RFC 3378 EtherIP support, ported from OpenBSD to NetBSD by
Hans Rosenfeld (rosenfeld at grumpf.hope-2000.org)

This change makes it possible to add gif interfaces to bridges, which
will then send and receive IP protocol 97 packets. Packets are Ethernet
frames with an EtherIP header prepended.


Revision tags: yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.27 04-Dec-2004 peter

branches: 1.27.4; 1.27.6;
Change ifc_destroy to return an int instead of void, so that it
can pass back errors to ifconfig.


# 1.26 06-Oct-2004 bad

Interfaces that do checksum offloading indicate the checksum status of
received packets in csum_flags in the packet header. Packets that are
forwarded over the bridge need to have csum_flags cleared before being
put on the output queue. Do so in bridge_enqueue().

Discussed with Jason Thorpe.

Fixes PR kern/27007 and the first part of PR kern/21831.


# 1.25 05-Oct-2004 christos

Only enable BRIDGE_IPF code if PFIL_HOOKS is enabled.


# 1.24 21-Apr-2004 itojun

kill a sprintf


# 1.23 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.22 31-Jan-2004 jdc

branches: 1.22.2;
Use m_copydata(), m_adj() and M_PREPEND() to manipulate mbuf's in
bridge_ipf(). Fixes kernel memory corruption that occured when using
m_split() and m_cat().
Idea from OpenBSD.


# 1.21 09-Dec-2003 augustss

Fix spelling mistake in a comment.


# 1.20 28-Oct-2003 mycroft

Mark this initializer in the canonical way so it can be found later.


# 1.19 25-Oct-2003 christos

Fix uninitialized variable warnings


# 1.18 16-Sep-2003 jdc

Add a flag parameter to bridge_enqueue() to tell it whether to run the
filter or not. We only need to run the filter for bridge_forward() and
bridge_broadcast(). If we also run it for bridge_output(), we will run
the filter twice outbound per packet, so don't.

In bridge_ipf(), make sure we don't run m_cat() on a single mbuf chain
by checking to see (and remembering) if we need to m_split() the mbuf.
This fixes bridge + ipfilter on sparc.

Fixes PR kern/22063.


# 1.17 11-Aug-2003 itojun

rm extra blank line


# 1.16 13-Jul-2003 jdc

Include opt_inet.h to get INET6 definition.
Now, bridged ipv6 packets are passed through ipfilter.
However, some v6 packets still do not get transmitted when ipf is enabled.
Partial fix for PR kern/22063.


# 1.15 23-Jun-2003 martin

branches: 1.15.2;
Make sure to include opt_foo.h if a defflag option FOO is used.


# 1.14 24-May-2003 kristerw

Make sure splx() is called for all bridge_ioctl() error cases.


# 1.13 16-May-2003 itojun

use strlcpy


# 1.12 14-May-2003 itojun

use arc4random


# 1.11 19-Mar-2003 bouyer

Fix 2 bugs:
- initialise stp when the bridge is turned up, without this stp will keep
all interfaces disabled in a sequence like:
brconfig bridge0 add if0 add if1 stp if0 stp if1 up
- s/BRDGSPRI/BRDGSIFPRIO in brconfig.c:cmd_ifpriority()

add a command (ifpathcost) to change the stp path cost of the STP path cost of
an interface. Display the interface path cost with the others STP parameters.


# 1.10 27-Feb-2003 perseant

Make BRIDGE_IPF an option, and document it. Add it (commented) to GENERIC.
Let brconfig tell whether the bridge is using the ipfilter hook, or not.


# 1.9 15-Feb-2003 perseant

Add ipf packet-filtering option to if_bridge. The option is controlled at
compile-time by BRIDGE_IPF, and at runtime by brconfig with the {ipf,-ipf}
option on a per-bridge basis.

As a side-effect, add PFIL_HOOKS processing to if_bridge.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base kqueue-aftermerge kqueue-beforemerge gehenna-devsw-base kqueue-base
# 1.8 24-Aug-2002 martin

Add a function to lookup bridge members by struct ifnet * and use
it at all call sites that have such a pointer readily available.
This avoids unnecessary strcmp()s in critical paths, and removes
some XXX comments.


# 1.7 08-Jun-2002 itojun

reject "add" request if if_mtu is different.


# 1.6 23-May-2002 itojun

use IFT_BRIDGE


Revision tags: netbsd-1-6-base
# 1.5 24-Mar-2002 jdolecek

branches: 1.5.2; 1.5.4;
Fix a memory leak in bridge_ioctl_add() when the called for non-ethernet
interface.
Problem noted and fix provided by in kern/16019 by Love.


Revision tags: eeh-devprop-base newlock-base
# 1.4 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.3 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.2 17-Aug-2001 thorpej

branches: 1.2.2; 1.2.4;
Only report expire time for DYNAMIC forwarding table entries.


# 1.1 17-Aug-2001 thorpej

Add support for building Ethernet bridges, based on Jason Wright's
bridge driver from OpenBSD, although the bridge code has been *heavily*
modified by me (the 802.1D code remains mostly unchanged from the
original).