History log of /freebsd-current/sys/net/vnet.c
Revision Date Author Comments
# bd7b2f95 05-Dec-2023 Kristof Provost <kp@FreeBSD.org>

vnet: (read) lock the vnet list while iterating it

Ensure that the vnet list cannot be modified while we're running through
it.

Reviewed by: mjg (previous version), zlei (previous version)
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D42927


# cf7974fd 20-Sep-2023 Zhenlei Huang <zlei@FreeBSD.org>

sysctl: Update 'master' copy of vnet SYSCTLs on kernel environment variables change

Complete phase three of 3da1cf1e88f8.

With commit 110113bc086f, vnet sysctl variables can be loader tunable
but the feature is limited. When the kernel modules have been initialized,
any changes (e.g. via kenv) to kernel environment variable will not affect
subsequently created VNETs.

This change relexes the limitation by listening on kernel environment
variable's set / unset events, and then update the 'master' copy of vnet
SYSCTL or restore it to its initial value.

With this change, TUNABLE_XXX_FETCH can be greately eliminated for vnet
loader tunables.

Reviewed by: glebius
Fixes: 110113bc086f sysctl(9): Enable vnet sysctl variables to be loader tunable
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D41825


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# a7acce34 17-Apr-2023 Zhenlei Huang <zlei@FreeBSD.org>

vnet: Fix a typo in a source code comment

- s/form/from/

MFC after: 3 days


# fb9b76e0 21-Feb-2023 Zhenlei Huang <zlei@FreeBSD.org>

vnet: Make vnet_sys[un]init() static

These two functions are intended to be used only when allocating or
destroying vnet instances.

No functional change intended.

Reviewed by: kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D37955


# c84c5e00 18-Jul-2022 Mitchell Horne <mhorne@FreeBSD.org>

ddb: annotate some commands with DB_CMD_MEMSAFE

This is not completely exhaustive, but covers a large majority of
commands in the tree.

Reviewed by: markj
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D35583


# 6d2a10d9 08-Feb-2021 Kristof Provost <kp@FreeBSD.org>

Widen ifnet_detach_sxlock coverage

Widen the ifnet_detach_sxlock to cover the entire vnet sysuninit code.
This ensures that we can't end up having the vnet_sysuninit free the UDP
pcb while the detach code is running and trying to purge the UDP pcb.

MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28530


# 10108cb6 17-Feb-2020 Bjoern A. Zeeb <bz@FreeBSD.org>

Partially revert VNET change and expand VNET structure.

Revert parts of r353274 replacing vnet_state with a shutdown flag.

Not having the state flag for the current SI_SUB_* makes it harder to debug
kernel or module panics related to VNET bringup or teardown.
Not having the state also does not allow us to check for other dependency
levels between components, e.g. for moving interfaces.

Expand the VNET structure with the new boolean flag indicating that we are
doing a shutdown of a given vnet and update the vnet magic cookie for the
change.

Update libkvm to compile with a bool in the kernel struct.

Bump __FreeBSD_version for (external) module builds to more easily detect
the change.

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D23097


# a362cf52 08-Oct-2019 Hans Petter Selasky <hselasky@FreeBSD.org>

Fix regression issue after r353274:

Make sure the vnet_shutdown field is not set until after all
VNET_SYSUNINIT()'s in the SI_SUB_VNET_DONE subsystem have been
executed. Especially the vnet_if_return() functions requires that
if_move() is still operational.

Reported by: lwhsu@
MFC after: 1 week
Sponsored by: Mellanox Technologies


# 204e2f30 07-Oct-2019 Hans Petter Selasky <hselasky@FreeBSD.org>

Factor out VNET shutdown check into an own vnet structure field.
Remove the now obsolete vnet_state field. This greatly simplifies the
detection of VNET shutdown and avoids code duplication.

Discussed with: bz@
MFC after: 1 week
Sponsored by: Mellanox Technologies


# c955c6cd 30-Oct-2018 Bjoern A. Zeeb <bz@FreeBSD.org>

With more excessive use of modules, more kernel parts working with
VIMAGE, and feature richness and global state increasing the 8k of
vnet module space are no longer sufficient for people and loading
multiple modules, e.g., pf(4) and ipl(4) or ipsec(4) will fail on
the second module.

Increase the module space to 8 * PAGE_SIZE which should be enough
to hold multiple firewalls, ipsec, multicast (as in the old days was
a problem), epair, carp, and any kind of other vnet enabled modules.

Sadly this is a global byte array part of the vnet_set, so we cannot
dynamically change its size; otherwise a TUNABLE would have been
a better solution.

PR: 228854
Reported by: Ernie Luzar, Marek Zarychta
Discussed with: rgrimes on current
MFC after: 3 days


# cd2106ea 30-Jul-2018 Andrew Turner <andrew@FreeBSD.org>

Ensure the DPCPU and VNET module spaces are aligned to hold a pointer.
Previously they may have been aligned to a char, leading to misaligned
DPCPU and VNET variables.

Sponsored by: DARPA, AFRL


# 5f901c92 24-Jul-2018 Andrew Turner <andrew@FreeBSD.org>

Use the new VNET_DEFINE_STATIC macro when we are defining static VNET
variables.

Reviewed by: bz
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D16147


# 891cf3ed 18-May-2018 Ed Maste <emaste@FreeBSD.org>

Use NULL for SYSINIT's last arg, which is a pointer type

Sponsored by: The FreeBSD Foundation


# fe267a55 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys: general adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.


# 8e94025b 20-Oct-2017 Bjoern A. Zeeb <bz@FreeBSD.org>

With r181803 on 2008-08-17 23:27:27Z the first VIMAGE commit went into
HEAD. Enable VIMAGE in GENERIC kernels and some others (where GENERIC does
not exist) on HEAD.

Disable building LINT-VIMAGE with VIMAGE being default.

This should give it a lot more exposure in the run-up to 12 to help
us evaluate whether to keep it on by default or not.
We are also hoping to get better performance testing.
The feature can be disabled using nooptions.

Requested by: many
Reviewed by: kristof, emaste, hiren
X-MFC after: never
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D12639


# 89856f7e 21-Jun-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Get closer to a VIMAGE network stack teardown from top to bottom rather
than removing the network interfaces first. This change is rather larger
and convoluted as the ordering requirements cannot be separated.

Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and
related modules to their own SI_SUB_PROTO_FIREWALL.
Move initialization of "physical" interfaces to SI_SUB_DRIVERS,
move virtual (cloned) interfaces to SI_SUB_PSEUDO.
Move Multicast to SI_SUB_PROTO_MC.

Re-work parts of multicast initialisation and teardown, not taking the
huge amount of memory into account if used as a module yet.

For interface teardown we try to do as many of them as we can on
SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling
over a higher layer protocol such as IP. In that case the interface
has to go along (or before) the higher layer protocol is shutdown.

Kernel hhooks need to go last on teardown as they may be used at various
higher layers and we cannot remove them before we cleaned up the higher
layers.

For interface teardown there are multiple paths:
(a) a cloned interface is destroyed (inside a VIMAGE or in the base system),
(b) any interface is moved from a virtual network stack to a different
network stack ("vmove"), or (c) a virtual network stack is being shut down.
All code paths go through if_detach_internal() where we, depending on the
vmove flag or the vnet state, make a decision on how much to shut down;
in case we are destroying a VNET the individual protocol layers will
cleanup their own parts thus we cannot do so again for each interface as
we end up with, e.g., double-frees, destroying locks twice or acquiring
already destroyed locks.
When calling into protocol cleanups we equally have to tell them
whether they need to detach upper layer protocols ("ulp") or not
(e.g., in6_ifdetach()).

Provide or enahnce helper functions to do proper cleanup at a protocol
rather than at an interface level.

Approved by: re (hrs)
Obtained from: projects/vnet
Reviewed by: gnn, jhb
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D6747


# ad4e9116 18-May-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Rather than having the if_vmove() code intermixed in the vnet_destroy()
function in vnet.c move it to if.c where it logically belongs and put
it under a VNET_SYSUNINIT() call.
To not change the current behaviour make sure it runs first thing
during teardown. In the future this will allow us more flexibility
on changing the order on when we want to get rid of interfaces.

Stop exporting if_vmove() and make it file static.

Reviewed by: gnn
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6438


# 94081f88 18-May-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Add a "vnet_state" field to struct vnet.
This is set to the SI_SUB_* value before executing any VNET_SYSINIT
or VNET_SYSUNINT. While good for debugging especially VNET teardown
problems having a chance to know at which level during teardown we are,
it will also be used to identify to detcted a "stable state"
(as in fully up and running) later on.

Obtained from: projects/vnet
Sponsored by: The FreeBSD Foundation


# 00e36a5c 18-May-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Add a dummy VNET_SYSINIT that will make sure all VNETs started will
always end on SI_SUB_VNET_DONE.

Obtained from: projects/vnet
Sponsored by: The FreeBSD Foundation


# 5fa0728b 18-May-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Split 'show vnets' into 'show vnet' and 'show all vnets'.
While here adjust some db_printf format string.

Document the two show commands in ddb.4.

Sponsored by: The FreeBSD Foundation


# 54d9f34e 16-May-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Mark the unused arguments of various SYSINIT functions __unused.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation


# ca7ba6a8 25-Jan-2016 Marko Zec <zec@FreeBSD.org>

Prune a definition which is / was never used.


# 1f12da0e 22-Jan-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Just checkpoint the WIP in order to be able to make the tree update
easier. Note: this is currently not in a usable state as certain
teardown parts are not called and the DOMAIN rework is missing.
More to come soon and find its way to head.

Obtained from: P4 //depot/user/bz/vimage/...
Sponsored by: The FreeBSD Foundation


# b5c32cf4 07-Feb-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Remove identical vnet sysctl handlers, and handle CTLFLAG_VNET
in the sysctl_root().

Note: SYSCTL_VNET_* macros can be removed as well. All is
needed to virtualize a sysctl oid is set CTLFLAG_VNET on it.
But for now keep macros in place to avoid large code churn.

Sponsored by: Nginx, Inc.


# 4678c740 27-Nov-2013 Gleb Smirnoff <glebius@FreeBSD.org>

Fix build.


# d9fae5ab 26-Nov-2013 Andriy Gapon <avg@FreeBSD.org>

dtrace sdt: remove the ugly sname parameter of SDT_PROBE_DEFINE

In its stead use the Solaris / illumos approach of emulating '-' (dash)
in probe names with '__' (two consecutive underscores).

Reviewed by: markj
MFC after: 3 weeks


# 54366c0b 25-Nov-2013 Attilio Rao <attilio@FreeBSD.org>

- For kernel compiled only with KDTRACE_HOOKS and not any lock debugging
option, unbreak the lock tracing release semantic by embedding
calls to LOCKSTAT_PROFILE_RELEASE_LOCK() direclty in the inlined
version of the releasing functions for mutex, rwlock and sxlock.
Failing to do so skips the lockstat_probe_func invokation for
unlocking.
- As part of the LOCKSTAT support is inlined in mutex operation, for
kernel compiled without lock debugging options, potentially every
consumer must be compiled including opt_kdtrace.h.
Fix this by moving KDTRACE_HOOKS into opt_global.h and remove the
dependency by opt_kdtrace.h for all files, as now only KDTRACE_FRAMES
is linked there and it is only used as a compile-time stub [0].

[0] immediately shows some new bug as DTRACE-derived support for debug
in sfxge is broken and it was never really tested. As it was not
including correctly opt_kdtrace.h before it was never enabled so it
was kept broken for a while. Fix this by using a protection stub,
leaving sfxge driver authors the responsibility for fixing it
appropriately [1].

Sponsored by: EMC / Isilon storage division
Discussed with: rstone
[0] Reported by: rstone
[1] Discussed with: philip


# d745c852 06-Nov-2011 Ed Schouten <ed@FreeBSD.org>

Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.

This means that their use is restricted to a single C file.


# 58ccf5b4 11-Jan-2011 John Baldwin <jhb@FreeBSD.org>

Remove unneeded includes of <sys/linker_set.h>. Other headers that use
it internally contain nested includes.

Reviewed by: bde


# 9269189f 09-Jan-2011 Bjoern A. Zeeb <bz@FreeBSD.org>

MfP4 CH=185246 [1]:

Add FEATURE() to announce optional VIMAGE.

MFC after: 3 days
[1] for the moment put it in vnet.c.


# 3e288e62 22-Nov-2010 Dimitry Andric <dim@FreeBSD.org>

After some off-list discussion, revert a number of changes to the
DPCPU_DEFINE and VNET_DEFINE macros, as these cause problems for various
people working on the affected files. A better long-term solution is
still being considered. This reversal may give some modules empty
set_pcpu or set_vnet sections, but these are harmless.

Changes reverted:

------------------------------------------------------------------------
r215318 | dim | 2010-11-14 21:40:55 +0100 (Sun, 14 Nov 2010) | 4 lines

Instead of unconditionally emitting .globl's for the __start_set_xxx and
__stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu
sections are actually defined.

------------------------------------------------------------------------
r215317 | dim | 2010-11-14 21:38:11 +0100 (Sun, 14 Nov 2010) | 3 lines

Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout
the tree.

------------------------------------------------------------------------
r215316 | dim | 2010-11-14 21:23:02 +0100 (Sun, 14 Nov 2010) | 2 lines

Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.


# 31c6a003 14-Nov-2010 Dimitry Andric <dim@FreeBSD.org>

Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout
the tree.


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# 79856499 22-Aug-2010 Rui Paulo <rpaulo@FreeBSD.org>

Add an extra comment to the SDT probes definition. This allows us to get
use '-' in probe names, matching the probe names in Solaris.[1]

Add userland SDT probes definitions to sys/sdt.h.

Sponsored by: The FreeBSD Foundation
Discussed with: rwaston [1]


# f7eebc1c 17-May-2010 Bjoern A. Zeeb <bz@FreeBSD.org>

MFC r208100:
Fix an issue with the dynamic pcpu/vnet data allocators.

We cannot expect that modspace is the last entry in the linker
set and thus that modspace + possible extra space up to PAGE_SIZE
would be contiguous. For the moment do not support more than
*_MODMIN space and ignore the extra space.

Discussed with: jeff, rwatson (briefly)
Reviewed by: jeff
Sponsored by: The FreeBSD Foundation
Sponsored by: CK Software GmbH


# 793f71bf 14-May-2010 Bjoern A. Zeeb <bz@FreeBSD.org>

Fix an issue with the dynamic pcpu/vnet data allocators.

We cannot expect that modspace is the last entry in the linker
set and thus that modspace + possible extra space up to PAGE_SIZE
would be contiguous. For the moment do not support more than
*_MODMIN space and ignore the extra space (*).

(*) We know how to get it back but it'll need testing.

Discussed with: jeff, rwatson (briefly)
Reviewed by: jeff
Sponsored by: The FreeBSD Foundation
Sponsored by: CK Software GmbH
MFC after: 4 days


# 407b1937 21-Apr-2010 Bjoern A. Zeeb <bz@FreeBSD.org>

MFC r205345:

Split eventhandler_register() into an internal part and a wrapper function
that provides the allocated and setup eventhandler entry.

Add a new wrapper for VIMAGE that allocates extra space to hold the
callback function and argument in addition to an extra wrapper function.
While the wrapper function goes as normal callback function the
argument points to the extra space allocated holding the original func
and arg that the wrapper function can then call.

Provide an iterator function for the virtual network stack (vnet) that
will call the callback function for each network stack.

Provide a new set of macros for VNET that in the non-VIMAGE case will
just call eventhandler_register() while in the VIMAGE case it will use
vimage_eventhandler_register() passing in the extra iterator function
but will only register once rather than per-vnet.
We need a special macro in case we are interested in the tag returned
as we must check for curvnet and can neither simply assign the
return value, nor not change it in the non-vnet0 case without that.

Discussed with: jhb
Reviewed by: zec (earlier version), jhb


# 7a90b212 14-Apr-2010 Julian Elischer <julian@FreeBSD.org>

Move two copies of the same definition to a common include file.

MFC after: 3 weeks


# 78ba8b29 27-Mar-2010 Bjoern A. Zeeb <bz@FreeBSD.org>

MFC r203729:

Add DDB support for printing vnet_sysinit and vnet_sysuninit
ordered call lists. Try to lookup function/symbol names and print
those in addition to the pointers, along with the constants for
subsystem and order.
This is useful for debugging vnet teardown ordering issues.

Make it possible to call the actual printing frunction from normal
code at runtime, ie. from vnet_sysuninit(), if DDB support is there.


# 72ec67fc 27-Mar-2010 Bjoern A. Zeeb <bz@FreeBSD.org>

MFC r203727:

Add an SDT provider for "vnet"s along with probes for vnet_alloc
and vnet_destroy.
Use the line number rather than NULL as dummy argument.

Note: the fbt provider does not reliably provide :return probes
(depending on optimization levels used at compile time) making
it unusable for scripts to generate complete call-traces with
well defined boundaries over allocations or destructions of
virtual network stacks.


# 42eedeac 19-Mar-2010 Bjoern A. Zeeb <bz@FreeBSD.org>

Split eventhandler_register() into an internal part and a wrapper function
that provides the allocated and setup eventhandler entry.

Add a new wrapper for VIMAGE that allocates extra space to hold the
callback function and argument in addition to an extra wrapper function.
While the wrapper function goes as normal callback function the
argument points to the extra space allocated holding the original func
and arg that the wrapper function can then call.

Provide an iterator function for the virtual network stack (vnet) that
will call the callback function for each network stack.

Provide a new set of macros for VNET that in the non-VIMAGE case will
just call eventhandler_register() while in the VIMAGE case it will use
vimage_eventhandler_register() passing in the extra iterator function
but will only register once rather than per-vnet.
We need a special macro in case we are interested in the tag returned
as we must check for curvnet and can neither simply assign the
return value, nor not change it in the non-vnet0 case without that.

Sponsored by: ISPsystem
Discussed with: jhb
Reviewed by: zec (earlier version), jhb
MFC after: 1 month


# 7526c9df 10-Feb-2010 Marko Zec <zec@FreeBSD.org>

MFC r203483:
Instead of spamming the console on each curvnet recursion event, print
out each such call graph only once, along with a stack backtrace. This
should make kernels built with VNET_DEBUG reasonably usable again in
busy / production environments.

Introduce a new DDB command "show vnetrcrs" which dumps the whole log
of distinctive curvnet recursion events. This might be useful when
recursion reports get burried / lost too deep in the message buffer.
In the later case stack backtraces are not available.

Reviewed by: bz


# 3e0490b3 09-Feb-2010 Bjoern A. Zeeb <bz@FreeBSD.org>

Add DDB support for printing vnet_sysinit and vnet_sysuninit
ordered call lists. Try to lookup function/symbol names and print
those in addition to the pointers, along with the constants for
subsystem and order.
This is useful for debugging vnet teardown ordering issues.

Make it possible to call the actual printing frunction from normal
code at runtime, ie. from vnet_sysuninit(), if DDB support is there.

Sponsored by: ISPsystem
MFC After: 8 days


# 61d033d4 09-Feb-2010 Bjoern A. Zeeb <bz@FreeBSD.org>

Add an SDT provider for "vnet"s along with probes for vnet_alloc
and vnet_destroy.
Use the line number rather than NULL as dummy argument.

Note: the fbt provider does not reliably provide :return probes
(depending on optimization levels used at compile time) making
it unusable for scripts to generate complete call-traces with
well defined boundaries over allocations or destructions of
virtual network stacks.

Sponsored by: ISPsystem
MFC After: 8 days


# 0a705ab6 04-Feb-2010 Marko Zec <zec@FreeBSD.org>

Instead of spamming the console on each curvnet recursion event, print
out each such call graph only once, along with a stack backtrace. This
should make kernels built with VNET_DEBUG reasonably usable again in
busy / production environments.

Introduce a new DDB command "show vnetrcrs" which dumps the whole log
of distinctive curvnet recursion events. This might be useful when
recursion reports get burried / lost too deep in the message buffer.
In the later case stack backtraces are not available.

Reviewed by: bz
MFC after: 3 days


# e9cedda8 31-Aug-2009 Marko Zec <zec@FreeBSD.org>

MFC r196633:

Introduce a separate sx lock for protecting lists of vnet sysinit
and sysuninit handlers.

Previously, sx_vnet, which is a lock designated for protecting
the vnet list, was (ab)used for protecting vnet sysinit / sysuninit
handler lists as well. Holding exclusively the sx_vnet lock while
invoking sysinit and / or sysuninit handlers turned out to be
problematic, since some of the handlers may attempt to wake up
another thread and wait for it to walk over the vnet list, hence
acquire a shared lock on sx_vnet, which in turn leads to a deadlock.
Protecting vnet sysinit / sysuninit lists with a separate lock
mitigates this issue, which was first observed with
flowtable_flush() / flowtable_cleaner() in sys/net/flowtable.c.

Reviewed by: rwatson, jhb
MFC after: 3 days

Approved by: re (rwatson)


# a99fcfd4 28-Aug-2009 Marko Zec <zec@FreeBSD.org>

Introduce a separate sx lock for protecting lists of vnet sysinit
and sysuninit handlers.

Previously, sx_vnet, which is a lock designated for protecting
the vnet list, was (ab)used for protecting vnet sysinit / sysuninit
handler lists as well. Holding exclusively the sx_vnet lock while
invoking sysinit and / or sysuninit handlers turned out to be
problematic, since some of the handlers may attempt to wake up
another thread and wait for it to walk over the vnet list, hence
acquire a shared lock on sx_vnet, which in turn leads to a deadlock.
Protecting vnet sysinit / sysuninit lists with a separate lock
mitigates this issue, which was first observed with
flowtable_flush() / flowtable_cleaner() in sys/net/flowtable.c.

Reviewed by: rwatson, jhb
MFC after: 3 days


# 3baf84f5 11-Aug-2009 Bjoern A. Zeeb <bz@FreeBSD.org>

MFC r196129:

Update DDB show vnet command to print all used and available information.

Reviewed by: rwatson, zec

Approved by: re


# 281c86a4 11-Aug-2009 Bjoern A. Zeeb <bz@FreeBSD.org>

Update DDB show vnet command to print all used and available information.

Reviewed by: rwatson, zec
Approved by: re


# 6aad5c1c 01-Aug-2009 Robert Watson <rwatson@FreeBSD.org>

The colour was red as shall be the letters of this warning to people upon
boot if the experimental VIMAGE feature was compiled into the kernel.

Submitted by: bz
Reviewed by: zec
Approved by: re (vimage blanket)


# c8f6a138 01-Aug-2009 Robert Watson <rwatson@FreeBSD.org>

Minor style tweaks.

Approved by: re (vimage blanket)


# 6bc2c7b7 01-Aug-2009 Robert Watson <rwatson@FreeBSD.org>

Make the vnet alloc/destroy paths a bit easier to followg by merging
vnet_data_init/vnet_data_destroy into vnet_alloc/vnet_destroy.

Reviewed by: bz, zec
Approved by: re (vimage blanket)


# 7429a3f3 01-Aug-2009 Robert Watson <rwatson@FreeBSD.org>

Remove vnet_foreach() utility function, which previously allowed
vnet.c to iterate virtual network stacks without being aware of
the implementation details previously hidden in kern_vimage.c.
Now they are in the same file, so remove this added complexity.

Reviewed by: bz
Approved by: re (vimage blanket)


# 530c0060 01-Aug-2009 Robert Watson <rwatson@FreeBSD.org>

Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks. Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by: bz
Approved by: re (vimage blanket)


# ed3db012 29-Jul-2009 Robert Watson <rwatson@FreeBSD.org>

Reorder and recomment vnet.c and vnet.h on the basis that they are no longer
solely about the virtual network stack memory allocator.

Approved by: re (vimage blanket)


# d0728d71 23-Jul-2009 Robert Watson <rwatson@FreeBSD.org>

Introduce and use a sysinit-based initialization scheme for virtual
network stacks, VNET_SYSINIT:

- Add VNET_SYSINIT and VNET_SYSUNINIT macros to declare events that will
occur each time a network stack is instantiated and destroyed. In the
!VIMAGE case, these are simply mapped into regular SYSINIT/SYSUNINIT.
For the VIMAGE case, we instead use SYSINIT's to track their order and
properties on registration, using them for each vnet when created/
destroyed, or immediately on module load for already-started vnets.
- Remove vnet_modinfo mechanism that existed to serve this purpose
previously, as well as its dependency scheme: we now just use the
SYSINIT ordering scheme.
- Implement VNET_DOMAIN_SET() to allow protocol domains to declare that
they want init functions to be called for each virtual network stack
rather than just once at boot, compiling down to DOMAIN_SET() in the
non-VIMAGE case.
- Walk all virtualized kernel subsystems and make use of these instead
of modinfo or DOMAIN_SET() for init/uninit events. In some cases,
convert modular components from using modevent to using sysinit (where
appropriate). In some cases, do minor rejuggling of SYSINIT ordering
to make room for or better manage events.

Portions submitted by: jhb (VNET_SYSINIT), bz (cleanup)
Discussed with: jhb, bz, julian, zec
Reviewed by: bz
Approved by: re (VIMAGE blanket)


# eddfbb76 14-Jul-2009 Robert Watson <rwatson@FreeBSD.org>

Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator. Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...). This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack. Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory. Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy. Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address. When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by: bz
Reviewed by: bz, zec
Discussed with: gnn, jamie, jeff, jhb, julian, sam
Suggested by: peter
Approved by: re (kensmith)