History log of /freebsd-current/sys/net/vnet.h
Revision Date Author Comments
# 2497c70f 15-Mar-2024 Gleb Smirnoff <glebius@FreeBSD.org>

vnet: remove unneeded backslash

Fixes: 430e0e409ce94246bb252cbdddef866fc69dea95


# cf7974fd 20-Sep-2023 Zhenlei Huang <zlei@FreeBSD.org>

sysctl: Update 'master' copy of vnet SYSCTLs on kernel environment variables change

Complete phase three of 3da1cf1e88f8.

With commit 110113bc086f, vnet sysctl variables can be loader tunable
but the feature is limited. When the kernel modules have been initialized,
any changes (e.g. via kenv) to kernel environment variable will not affect
subsequently created VNETs.

This change relexes the limitation by listening on kernel environment
variable's set / unset events, and then update the 'master' copy of vnet
SYSCTL or restore it to its initial value.

With this change, TUNABLE_XXX_FETCH can be greately eliminated for vnet
loader tunables.

Reviewed by: glebius
Fixes: 110113bc086f sysctl(9): Enable vnet sysctl variables to be loader tunable
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D41825


# 95ee2897 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: two-line .h pattern

Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/


# 5440e701 28-Jul-2023 Dmitry Chagin <dchagin@FreeBSD.org>

i386: Don't use static DPCPU and VNET defines in i386 modules

As of c84617e8 a similar to 4802a2cb and b6ea4c5a fix should be
applied to i386 too.

Reviewed by:
Differential Revision: https://reviews.freebsd.org/D41195


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# fb9b76e0 21-Feb-2023 Zhenlei Huang <zlei@FreeBSD.org>

vnet: Make vnet_sys[un]init() static

These two functions are intended to be used only when allocating or
destroying vnet instances.

No functional change intended.

Reviewed by: kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D37955


# efe58855 24-May-2022 Mike Karels <karels@FreeBSD.org>

IPv4: experimental changes to allow net 0/8, 240/4, part of 127/8

Combined changes to allow experimentation with net 0/8 (network 0),
240/4 (Experimental/"Class E"), and part of the loopback net 127/8
(all but 127.0/16). All changes are disabled by default, and can be
enabled by the following sysctls:

net.inet.ip.allow_net0=1
net.inet.ip.allow_net240=1
net.inet.ip.loopback_prefixlen=16

When enabled, the corresponding addresses can be used as normal
unicast IP addresses, both as endpoints and when forwarding.

Add descriptions of the new sysctls to inet.4.

Add <machine/param.h> to vnet.h, as CACHE_LINE_SIZE is undefined in
various C files when in.h includes vnet.h.

The proposals motivating this experimentation can be found in

https://datatracker.ietf.org/doc/draft-schoen-intarea-unicast-0
https://datatracker.ietf.org/doc/draft-schoen-intarea-unicast-240
https://datatracker.ietf.org/doc/draft-schoen-intarea-unicast-127

Reviewed by: rgrimes, pauamma_gundo.com; previous versions melifaro, glebius
Differential Revision: https://reviews.freebsd.org/D35741


# 37f604b4 06-Jul-2022 Kristof Provost <kp@FreeBSD.org>

vnet: make VNET_FOREACH() always be a loop

VNET_FOREACH() is a LIST_FOREACH if VIMAGE is set, but empty if it's
not. This means that users of the macro couldn't use 'continue' or
'break' as one would expect of a loop.

Change VNET_FOREACH() to be a loop in all cases (although one that is
fixed to one iteration if VIMAGE is not set).

Reviewed by: karels, melifaro, glebius
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D35739


# 430e0e40 19-Feb-2022 Mateusz Guzik <mjg@FreeBSD.org>

vnet: add CURVNET_ASSERT_SET for !VIMAGE

Reported by: ler
Sponsored by: Rubicon Communications, LLC ("Netgate")


# 75cde1f8 17-Feb-2022 Mateusz Guzik <mjg@FreeBSD.org>

vnet: add CURVNET_ASSERT_SET

Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D34312


# 662c1305 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

net: clean up empty lines in .c and .h files


# 7029da5c 26-Feb-2020 Pawel Biernacki <kaktus@FreeBSD.org>

Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)

r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718


# 10108cb6 17-Feb-2020 Bjoern A. Zeeb <bz@FreeBSD.org>

Partially revert VNET change and expand VNET structure.

Revert parts of r353274 replacing vnet_state with a shutdown flag.

Not having the state flag for the current SI_SUB_* makes it harder to debug
kernel or module panics related to VNET bringup or teardown.
Not having the state also does not allow us to check for other dependency
levels between components, e.g. for moving interfaces.

Expand the VNET structure with the new boolean flag indicating that we are
doing a shutdown of a given vnet and update the vnet magic cookie for the
change.

Update libkvm to compile with a bool in the kernel struct.

Bump __FreeBSD_version for (external) module builds to more easily detect
the change.

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D23097


# 4715738b 07-Oct-2019 Hans Petter Selasky <hselasky@FreeBSD.org>

Compile time assert a valid subsystem for all VNET init and uninit functions.
Using VNET init and uninit functions outside the given range has undefined
behaviour.

MFC after: 1 week
Sponsored by: Mellanox Technologies


# 204e2f30 07-Oct-2019 Hans Petter Selasky <hselasky@FreeBSD.org>

Factor out VNET shutdown check into an own vnet structure field.
Remove the now obsolete vnet_state field. This greatly simplifies the
detection of VNET shutdown and avoids code duplication.

Discussed with: bz@
MFC after: 1 week
Sponsored by: Mellanox Technologies


# e2edff41 25-Jun-2019 Leandro Lupori <luporl@FreeBSD.org>

[PowerPC64] Don't mark module data as static

Fixes panic when loading ipfw.ko and if_epair.ko built with modern compiler.

Similar to arm64 and riscv, when using a modern compiler (!gcc4.2), code
generated tries to access data in the wrong location, causing kernel panic
(data storage interrupt trap) when loading if_epair and ipfw.

Issue was reproduced with kernel/module compiled using gcc8 and clang8. It
affects both ELFv1 and ELFv2 ABI environments.

PR: 232387
Submitted by: alfredo.junior_eldorado.org.br
Reported by: Mark Millard
Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D20461


# 86c59375 12-Sep-2018 Ruslan Bukin <br@FreeBSD.org>

Don't mark module data as static on RISC-V.

Similar to arm64, riscv compiler uses PC-relative loads/stores,
and with static data compiler does not emit relocations.
In result, kernel module linker has nothing to fix and data accessed
from the wrong location.

Approved by: re (gjb)
Sponsored by: DARPA, AFRL


# b6ea4c5a 30-Jul-2018 Andrew Turner <andrew@FreeBSD.org>

As with DPCPU_DEFINE_STATIC make VNET_DEFINE_STATIC non-static on arm64 in
modules. It also fails in the same way, we are unable to relocate static
variables as the compiler uses PC-relative loads with nothing for the
kernel linker to relocate.

Sponsored by: DARPA, AFRL


# bc61d949 29-Jul-2018 Andrew Turner <andrew@FreeBSD.org>

As with DPCPU_DEFINE make it a compile error to use static with VNET_DEFINE.
There is the VNET_DEFINE_STATIC macro for that.


# fceba23f 24-Jul-2018 Andrew Turner <andrew@FreeBSD.org>

As with DPCPU create VNET_DEFINE_STATIC for when a variable needs to be
declaired static. This will allow us to change the definition on arm64
as it has the same issues described in r336349.

Reviewed by: bz
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D16147


# fe267a55 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys: general adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.


# 89856f7e 21-Jun-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Get closer to a VIMAGE network stack teardown from top to bottom rather
than removing the network interfaces first. This change is rather larger
and convoluted as the ordering requirements cannot be separated.

Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and
related modules to their own SI_SUB_PROTO_FIREWALL.
Move initialization of "physical" interfaces to SI_SUB_DRIVERS,
move virtual (cloned) interfaces to SI_SUB_PSEUDO.
Move Multicast to SI_SUB_PROTO_MC.

Re-work parts of multicast initialisation and teardown, not taking the
huge amount of memory into account if used as a module yet.

For interface teardown we try to do as many of them as we can on
SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling
over a higher layer protocol such as IP. In that case the interface
has to go along (or before) the higher layer protocol is shutdown.

Kernel hhooks need to go last on teardown as they may be used at various
higher layers and we cannot remove them before we cleaned up the higher
layers.

For interface teardown there are multiple paths:
(a) a cloned interface is destroyed (inside a VIMAGE or in the base system),
(b) any interface is moved from a virtual network stack to a different
network stack ("vmove"), or (c) a virtual network stack is being shut down.
All code paths go through if_detach_internal() where we, depending on the
vmove flag or the vnet state, make a decision on how much to shut down;
in case we are destroying a VNET the individual protocol layers will
cleanup their own parts thus we cannot do so again for each interface as
we end up with, e.g., double-frees, destroying locks twice or acquiring
already destroyed locks.
When calling into protocol cleanups we equally have to tell them
whether they need to detach upper layer protocols ("ulp") or not
(e.g., in6_ifdetach()).

Provide or enahnce helper functions to do proper cleanup at a protocol
rather than at an interface level.

Approved by: re (hrs)
Obtained from: projects/vnet
Reviewed by: gnn, jhb
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D6747


# 94081f88 18-May-2016 Bjoern A. Zeeb <bz@FreeBSD.org>

Add a "vnet_state" field to struct vnet.
This is set to the SI_SUB_* value before executing any VNET_SYSINIT
or VNET_SYSUNINT. While good for debugging especially VNET teardown
problems having a chance to know at which level during teardown we are,
it will also be used to identify to detcted a "stable state"
(as in fully up and running) later on.

Obtained from: projects/vnet
Sponsored by: The FreeBSD Foundation


# d17d4c6b 26-Jan-2016 Gleb Smirnoff <glebius@FreeBSD.org>

Provide TCPSTAT_DEC() and TCPSTAT_FETCH() macros.


# 6df8a710 07-Nov-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Remove SYSCTL_VNET_* macros, and simply put CTLFLAG_VNET where needed.

Sponsored by: Nginx, Inc.


# db2f5a24 09-Feb-2014 Mikolaj Golub <trociny@FreeBSD.org>

Fixup for r261590 (vnet sysctl handlers cleanup).

Reviewed by: glebius


# b5c32cf4 07-Feb-2014 Gleb Smirnoff <glebius@FreeBSD.org>

Remove identical vnet sysctl handlers, and handle CTLFLAG_VNET
in the sysctl_root().

Note: SYSCTL_VNET_* macros can be removed as well. All is
needed to virtualize a sysctl oid is set CTLFLAG_VNET on it.
But for now keep macros in place to avoid large code churn.

Sponsored by: Nginx, Inc.


# 9bea6fd6 09-Jul-2013 Andrey V. Elsukov <ae@FreeBSD.org>

Correct CTASSERT condition.


# 7daad711 09-Jul-2013 Andrey V. Elsukov <ae@FreeBSD.org>

Add several macros to help migrate statistics structures to PCPU counters.


# 144e6203 11-Feb-2011 Bjoern A. Zeeb <bz@FreeBSD.org>

Mfp4 CH=177255:

Resort the CURVNET_SET* macros in the non-VNET_DEBUG case to match
the call order of the VNET_DEBUG case.

Add the VNET_ASSERT() to the non-VNET_DEBUG case as well so that
INVARIANTS will still catch problems.

Sponsored by: The FreeBSD Foundation
Sponsored by: CK Software GmbH
Reviewed by: jhb

MFC after: 2 weeks


# 0028e524 11-Feb-2011 Bjoern A. Zeeb <bz@FreeBSD.org>

Mfp4 CH=177255:

Make VNET_ASSERT() available with either VNET_DEBUG or INVARIANTS.

Change the syntax to match KASSERT() to allow more flexible panic
messages rather than having a printf with hardcoded arguments
before panic.

Adjust the few assertions we have to the new format (and enhance
the output).

Sponsored by: The FreeBSD Foundation
Sponsored by: CK Software GmbH
Reviewed by: jhb

MFC after: 2 weeks


# 6cf986ac 10-Feb-2011 Bjoern A. Zeeb <bz@FreeBSD.org>

Mfp4 CH=177255:

Use __func__ rather than __FUNCTION__.

MFC after: 2 weeks


# f8e4b4ef 19-Jan-2011 Matthew D Fleming <mdf@FreeBSD.org>

sysctl(8) should use the CTLTYPE to determine the type of data when
reading. (This was already done for writing to a sysctl). This
requires all SYSCTL setups to specify a type. Most of them are now
checked at compile-time.

Remove SYSCTL_*X* sysctl additions as the print being in hex should be
controlled by the -x flag to sysctl(8).

Succested by: bde


# 3e288e62 22-Nov-2010 Dimitry Andric <dim@FreeBSD.org>

After some off-list discussion, revert a number of changes to the
DPCPU_DEFINE and VNET_DEFINE macros, as these cause problems for various
people working on the affected files. A better long-term solution is
still being considered. This reversal may give some modules empty
set_pcpu or set_vnet sections, but these are harmless.

Changes reverted:

------------------------------------------------------------------------
r215318 | dim | 2010-11-14 21:40:55 +0100 (Sun, 14 Nov 2010) | 4 lines

Instead of unconditionally emitting .globl's for the __start_set_xxx and
__stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu
sections are actually defined.

------------------------------------------------------------------------
r215317 | dim | 2010-11-14 21:38:11 +0100 (Sun, 14 Nov 2010) | 3 lines

Apply the STATIC_VNET_DEFINE and STATIC_DPCPU_DEFINE macros throughout
the tree.

------------------------------------------------------------------------
r215316 | dim | 2010-11-14 21:23:02 +0100 (Sun, 14 Nov 2010) | 2 lines

Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.


# c3adda9f 14-Nov-2010 Dimitry Andric <dim@FreeBSD.org>

Instead of unconditionally emitting .globl's for the __start_set_xxx and
__stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu
sections are actually defined.


# 47d46d92 14-Nov-2010 Dimitry Andric <dim@FreeBSD.org>

Add macros to define static instances of VNET_DEFINE and DPCPU_DEFINE.


# 7e54af08 12-Nov-2010 Dimitry Andric <dim@FreeBSD.org>

Similar to r212647, remove the workaround in sys/net/vnet.h for an ld
bug (incorrect placement of __start_SECNAME in some cases) that was
fixed in r210245.

There is already an UPDATING entry about needing a recent ld.

MFC after: 1 month


# 4403994d 11-Nov-2010 Dimitry Andric <dim@FreeBSD.org>

Use the same treatment as in linker_set.h for the __start and __stop
symbols of the set_vnet and set_pcpu sections, so those symbols will
always be emitted in kernel modules, if they use vnet.h or pcpu.h.

Also, for pcpu.h, make the __(start|stop)_set_pcpu declarations, and
associated macros invisible to userland, to prevent it picking up these
symbols.

Reviewed by: kib


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# b1ae592b 02-Jun-2010 Marko Zec <zec@FreeBSD.org>

Provide a macro for registering a virtualized sysctl handler for
VNET opaque data.

MFC after: 30 days


# 407b1937 21-Apr-2010 Bjoern A. Zeeb <bz@FreeBSD.org>

MFC r205345:

Split eventhandler_register() into an internal part and a wrapper function
that provides the allocated and setup eventhandler entry.

Add a new wrapper for VIMAGE that allocates extra space to hold the
callback function and argument in addition to an extra wrapper function.
While the wrapper function goes as normal callback function the
argument points to the extra space allocated holding the original func
and arg that the wrapper function can then call.

Provide an iterator function for the virtual network stack (vnet) that
will call the callback function for each network stack.

Provide a new set of macros for VNET that in the non-VIMAGE case will
just call eventhandler_register() while in the VIMAGE case it will use
vimage_eventhandler_register() passing in the extra iterator function
but will only register once rather than per-vnet.
We need a special macro in case we are interested in the tag returned
as we must check for curvnet and can neither simply assign the
return value, nor not change it in the non-vnet0 case without that.

Discussed with: jhb
Reviewed by: zec (earlier version), jhb


# 7a90b212 14-Apr-2010 Julian Elischer <julian@FreeBSD.org>

Move two copies of the same definition to a common include file.

MFC after: 3 weeks


# 42eedeac 19-Mar-2010 Bjoern A. Zeeb <bz@FreeBSD.org>

Split eventhandler_register() into an internal part and a wrapper function
that provides the allocated and setup eventhandler entry.

Add a new wrapper for VIMAGE that allocates extra space to hold the
callback function and argument in addition to an extra wrapper function.
While the wrapper function goes as normal callback function the
argument points to the extra space allocated holding the original func
and arg that the wrapper function can then call.

Provide an iterator function for the virtual network stack (vnet) that
will call the callback function for each network stack.

Provide a new set of macros for VNET that in the non-VIMAGE case will
just call eventhandler_register() while in the VIMAGE case it will use
vimage_eventhandler_register() passing in the extra iterator function
but will only register once rather than per-vnet.
We need a special macro in case we are interested in the tag returned
as we must check for curvnet and can neither simply assign the
return value, nor not change it in the non-vnet0 case without that.

Sponsored by: ISPsystem
Discussed with: jhb
Reviewed by: zec (earlier version), jhb
MFC after: 1 month


# 7526c9df 10-Feb-2010 Marko Zec <zec@FreeBSD.org>

MFC r203483:
Instead of spamming the console on each curvnet recursion event, print
out each such call graph only once, along with a stack backtrace. This
should make kernels built with VNET_DEBUG reasonably usable again in
busy / production environments.

Introduce a new DDB command "show vnetrcrs" which dumps the whole log
of distinctive curvnet recursion events. This might be useful when
recursion reports get burried / lost too deep in the message buffer.
In the later case stack backtraces are not available.

Reviewed by: bz


# 0a705ab6 04-Feb-2010 Marko Zec <zec@FreeBSD.org>

Instead of spamming the console on each curvnet recursion event, print
out each such call graph only once, along with a stack backtrace. This
should make kernels built with VNET_DEBUG reasonably usable again in
busy / production environments.

Introduce a new DDB command "show vnetrcrs" which dumps the whole log
of distinctive curvnet recursion events. This might be useful when
recursion reports get burried / lost too deep in the message buffer.
In the later case stack backtraces are not available.

Reviewed by: bz
MFC after: 3 days


# 857e5615 14-Aug-2009 Marko Zec <zec@FreeBSD.org>

MFC r196228:

Make VNET_DEBUG a standalone compile-time option, i.e. decouple it from
INVARIANTS.

Reviewed by: bz
Approved by: re (rwatson), julian (mentor)

Approved by: re (rwatson)


# 67addcde 14-Aug-2009 Marko Zec <zec@FreeBSD.org>

Make VNET_DEBUG a standalone compile-time option, i.e. decouple it from
INVARIANTS.

Reviewed by: bz
Approved by: re (rwatson), julian (mentor)


# da2a30fc 13-Aug-2009 Bjoern A. Zeeb <bz@FreeBSD.org>

MFC r196176:

Make it possible to change the vnet sysctl variables on jails
with their own virtual network stack. Jails only inheriting a
network stack cannot change anything that cannot be changed from
within a prison.

Reviewed by: rwatson, zec

Approved by: re (kib)


# eb79e1c7 13-Aug-2009 Bjoern A. Zeeb <bz@FreeBSD.org>

Make it possible to change the vnet sysctl variables on jails
with their own virtual network stack. Jails only inheriting a
network stack cannot change anything that cannot be changed from
within a prison.

Reviewed by: rwatson, zec
Approved by: re (kib)


# 28a2b3dd 12-Aug-2009 Bjoern A. Zeeb <bz@FreeBSD.org>

MFC r196118:
Put minimum alignment on the dpcpu and vnet section so that ld
when adding the __start_ symbol knows the expected section alignment
and can place the __start_ symbol correctly.

These sections will not support symbols with super-cache line alignment
requirements.

For full details, see posting to freebsd-current, 2009-08-10,
Message-ID: <20090810133111.C93661@maildrop.int.zabbadoz.net>.

Debugging and testing patches by:
Kamigishi Rei (spambox haruhiism.net),
np, lstewart, jhb, kib, rwatson
Tested by: Kamigishi Rei, lstewart
Reviewed by: kib

Approved by: re


# 1b501e53 12-Aug-2009 Bjoern A. Zeeb <bz@FreeBSD.org>

Put minimum alignment on the dpcpu and vnet section so that ld
when adding the __start_ symbol knows the expected section alignment
and can place the __start_ symbol correctly.

These sections will not support symbols with super-cache line alignment
requirements.

For full details, see posting to freebsd-current, 2009-08-10,
Message-ID: <20090810133111.C93661@maildrop.int.zabbadoz.net>.

Debugging and testing patches by:
Kamigishi Rei (spambox haruhiism.net),
np, lstewart, jhb, kib, rwatson
Tested by: Kamigishi Rei, lstewart
Reviewed by: kib
Approved by: re


# 6bc2c7b7 01-Aug-2009 Robert Watson <rwatson@FreeBSD.org>

Make the vnet alloc/destroy paths a bit easier to followg by merging
vnet_data_init/vnet_data_destroy into vnet_alloc/vnet_destroy.

Reviewed by: bz, zec
Approved by: re (vimage blanket)


# 530c0060 01-Aug-2009 Robert Watson <rwatson@FreeBSD.org>

Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks. Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by: bz
Approved by: re (vimage blanket)


# ed3db012 29-Jul-2009 Robert Watson <rwatson@FreeBSD.org>

Reorder and recomment vnet.c and vnet.h on the basis that they are no longer
solely about the virtual network stack memory allocator.

Approved by: re (vimage blanket)


# a9bcca79 28-Jul-2009 Robert Watson <rwatson@FreeBSD.org>

Revise header comments for vnet.h as we now implement VNET_SYSINIT, not
just VNET_DEFINE in vnet.h.

Approved by: re (vimage blanket)


# d0728d71 23-Jul-2009 Robert Watson <rwatson@FreeBSD.org>

Introduce and use a sysinit-based initialization scheme for virtual
network stacks, VNET_SYSINIT:

- Add VNET_SYSINIT and VNET_SYSUNINIT macros to declare events that will
occur each time a network stack is instantiated and destroyed. In the
!VIMAGE case, these are simply mapped into regular SYSINIT/SYSUNINIT.
For the VIMAGE case, we instead use SYSINIT's to track their order and
properties on registration, using them for each vnet when created/
destroyed, or immediately on module load for already-started vnets.
- Remove vnet_modinfo mechanism that existed to serve this purpose
previously, as well as its dependency scheme: we now just use the
SYSINIT ordering scheme.
- Implement VNET_DOMAIN_SET() to allow protocol domains to declare that
they want init functions to be called for each virtual network stack
rather than just once at boot, compiling down to DOMAIN_SET() in the
non-VIMAGE case.
- Walk all virtualized kernel subsystems and make use of these instead
of modinfo or DOMAIN_SET() for init/uninit events. In some cases,
convert modular components from using modevent to using sysinit (where
appropriate). In some cases, do minor rejuggling of SYSINIT ordering
to make room for or better manage events.

Portions submitted by: jhb (VNET_SYSINIT), bz (cleanup)
Discussed with: jhb, bz, julian, zec
Reviewed by: bz
Approved by: re (VIMAGE blanket)


# a08362ce 21-Jul-2009 Bjoern A. Zeeb <bz@FreeBSD.org>

sysctl_msec_to_ticks is used with both virtualized and
non-vrtiualized sysctls so we cannot used one common function.

Add a macro to convert the arg1 in the virtualized case to
vnet.h to not expose the maths to all over the code.

Add a wrapper for the single virtualized call, properly handling
arg1 and call the default implementation from there.

Convert the two over places to use the new macro.

Reviewed by: rwatson
Approved by: re (kib)


# 17ef1feb 20-Jul-2009 Robert Watson <rwatson@FreeBSD.org>

Add macros VNET_SETNAME and VNET_SYMPREFIX, and expose to userspace if
_WANT_VNET is defined. This way we don't need separate definitions in
libkvm.

Reviewed by: bz
Approved by: re (vimage blanket)


# 1e77c105 16-Jul-2009 Robert Watson <rwatson@FreeBSD.org>

Remove unused VNET_SET() and related macros; only VNET_GET() is
ever actually used. Rename VNET_GET() to VNET() to shorten
variable references.

Discussed with: bz, julian
Reviewed by: bz
Approved by: re (kensmith, kib)


# c1e200ff 14-Jul-2009 Robert Watson <rwatson@FreeBSD.org>

Add missing license line for vnet.h, correct white space nit.

Approved by: re (kensmith) (implicit)


# eddfbb76 14-Jul-2009 Robert Watson <rwatson@FreeBSD.org>

Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator. Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...). This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack. Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory. Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy. Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address. When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by: bz
Reviewed by: bz, zec
Discussed with: gnn, jamie, jeff, jhb, julian, sam
Suggested by: peter
Approved by: re (kensmith)


# 6cb7f168 29-Jun-2009 Brooks Davis <brooks@FreeBSD.org>

Remove support for the /dev/net/* per-interface devices. They serve
little purpose and are unused in the base system.

The IOCTL functionality is entirely duplicated and routing sockets
provide a richer interface than the kqueue functionality.

Further, it is not practical for these devices to be made sensible in
the face of VIMAGE.

Bump __FreeBSD_version on the off chance that there is any code out
there that actually uses this stuff.

Reviewed by: rwatson
Discussed with: bz, zec
Approved by: re@ (kensmith)


# 3952a5ab 22-Jun-2009 Bjoern A. Zeeb <bz@FreeBSD.org>

Updates after r194640:
- shrink size guards for vnet_net.
vnet_rtable does not need size guards as it is self-contained.
- remove a bunch of defines from vnet.h no longer valid.


# b58ea5f3 22-Jun-2009 Bjoern A. Zeeb <bz@FreeBSD.org>

Move virtualization of routing related variables into their own
Vimage module, which had been there already but now is stateful.

All variables are now file local; so this further limits the global
spreading of routing related things throughout the kernel.

Add a missing function local variable in case of MPATHing.

Reviewed by: zec


# bc29160d 08-Jun-2009 Marko Zec <zec@FreeBSD.org>

Introduce an infrastructure for dismantling vnet instances.

Vnet modules and protocol domains may now register destructor
functions to clean up and release per-module state. The destructor
mechanisms can be triggered by invoking "vimage -d", or a future
equivalent command which will be provided via the new jail framework.

While this patch introduces numerous placeholder destructor functions,
many of those are currently incomplete, thus leaking memory or (even
worse) failing to stop all running timers. Many of such issues are
already known and will be incrementaly fixed over the next weeks in
smaller incremental commits.

Apart from introducing new fields in structs ifnet, domain, protosw
and vnet_net, which requires the kernel and modules to be rebuilt, this
change should have no impact on nooptions VIMAGE builds, since vnet
destructors can only be called in VIMAGE kernels. Moreover,
destructor functions should be in general compiled in only in
options VIMAGE builds, except for kernel modules which can be safely
kldunloaded at run time.

Bump __FreeBSD_version to 800097.
Reviewed by: bz, julian
Approved by: rwatson, kib (re), julian (mentor)


# c2c2a7c1 01-Jun-2009 Bjoern A. Zeeb <bz@FreeBSD.org>

Convert the two dimensional array to be malloced and introduce
an accessor function to get the correct rnh pointer back.

Update netstat to get the correct pointer using kvm_read()
as well.

This not only fixes the ABI problem depending on the kernel
option but also permits the tunable to overwrite the kernel
option at boot time up to MAXFIBS, enlarging the number of
FIBs without having to recompile. So people could just use
GENERIC now.

Reviewed by: julian, rwatson, zec
X-MFC: not possible


# 37f17770 23-May-2009 Marko Zec <zec@FreeBSD.org>

V_irtualize the if_clone framework, thus allowing for clonable ifnets
to optionally have overlapping unit numbers if attached in different
vnets.

At this stage if_loop is the only clonable ifnet class that has been
extended to allow for such overlapping allocation of unit numbers, i.e.
in each vnet it is possible to have a lo0 interface. Other clonable ifnet
classes remain to operate with traditional semantics, i.e. each instance
of a clonable ifnet will be assigned a globally unique unit number,
regardless in which vnet such an ifnet becomes instantiated.

While here, garbage collect unused _lo_list field in struct vnet_net,
as well as improve indentation for #defines in sys/net/vnet.h.

The layout of struct vnet_net has changed, therefore bump
__FreeBSD_version.

This change has no functional impact on nooptions VIMAGE kernel builds.

Reviewed by: bz, brooks
Approved by: julian (mentor)


# 5f416f8e 02-May-2009 Marko Zec <zec@FreeBSD.org>

Make indentation more uniform accross vnet container structs.

This is a purely cosmetic / NOP change.

Reviewed by: bz
Approved by: julian (mentor)
Verified by: svn diff -x -w producing no output


# 1ed81b73 06-Apr-2009 Marko Zec <zec@FreeBSD.org>

First pass at separating per-vnet initializer functions
from existing functions for initializing global state.

At this stage, the new per-vnet initializer functions are
directly called from the existing global initialization code,
which should in most cases result in compiler inlining those
new functions, hence yielding a near-zero functional change.

Modify the existing initializer functions which are invoked via
protosw, like ip_init() et. al., to allow them to be invoked
multiple times, i.e. per each vnet. Global state, if any,
is initialized only if such functions are called within the
context of vnet0, which will be determined via the
IS_DEFAULT_VNET(curvnet) check (currently always true).

While here, V_irtualize a few remaining global UMA zones
used by net/netinet/netipsec networking code. While it is
not yet clear to me or anybody else whether this is the right
thing to do, at this stage this makes the code more readable,
and makes it easier to track uncollected UMA-zone-backed
objects on vnet removal. In the long run, it's quite possible
that some form of shared use of UMA zone pools among multiple
vnets should be considered.

Bump __FreeBSD_version due to changes in layout of structs
vnet_ipfw, vnet_inet and vnet_net.

Approved by: julian (mentor)


# 2bebb491 01-Mar-2009 Bjoern A. Zeeb <bz@FreeBSD.org>

Add size-guards evaluated at compile-time to the main struct vnet_*
which are not in a module of their own like gif.

Single kernel compiles and universe will fail if the size of the struct
changes. Th expected values are given in sys/vimage.h.
See the comments where how to handle this.

Requested by: peter


# 33553d6e 27-Feb-2009 Bjoern A. Zeeb <bz@FreeBSD.org>

For all files including net/vnet.h directly include opt_route.h and
net/route.h.

Remove the hidden include of opt_route.h and net/route.h from net/vnet.h.

We need to make sure that both opt_route.h and net/route.h are included
before net/vnet.h because of the way MRT figures out the number of FIBs
from the kernel option. If we do not, we end up with the default number
of 1 when including net/vnet.h and array sizes are wrong.

This does not change the list of files which depend on opt_route.h
but we can identify them now more easily.


# b27b816f 16-Feb-2009 Luigi Rizzo <luigi@FreeBSD.org>

we need if_var.h not if.h


# ada55ca0 14-Feb-2009 Luigi Rizzo <luigi@FreeBSD.org>

remove unnecessary #include from vnet.h and vinet.h

Approved by: Marko Zec


# 385195c0 10-Dec-2008 Marko Zec <zec@FreeBSD.org>

Conditionally compile out V_ globals while instantiating the appropriate
container structures, depending on VIMAGE_GLOBALS compile time option.

Make VIMAGE_GLOBALS a new compile-time option, which by default will not
be defined, resulting in instatiations of global variables selected for
V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be
effectively compiled out. Instantiate new global container structures
to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0,
vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0.

Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_
macros resolve either to the original globals, or to fields inside
container structures, i.e. effectively

#ifdef VIMAGE_GLOBALS
#define V_rt_tables rt_tables
#else
#define V_rt_tables vnet_net_0._rt_tables
#endif

Update SYSCTL_V_*() macros to operate either on globals or on fields
inside container structs.

Extend the internal kldsym() lookups with the ability to resolve
selected fields inside the virtualization container structs. This
applies only to the fields which are explicitly registered for kldsym()
visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently
this is done only in sys/net/if.c.

Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code,
and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in
turn result in proper code being generated depending on VIMAGE_GLOBALS.

De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c
which were prematurely V_irtualized by automated V_ prepending scripts
during earlier merging steps. PF virtualization will be done
separately, most probably after next PF import.

Convert a few variable initializations at instantiation to
initialization in init functions, most notably in ipfw. Also convert
TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in
initializer functions.

Discussed at: devsummit Strassburg
Reviewed by: bz, julian
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation


# f02493cb 28-Nov-2008 Marko Zec <zec@FreeBSD.org>

Unhide declarations of network stack virtualization structs from
underneath #ifdef VIMAGE blocks.

This change introduces some churn in #include ordering and nesting
throughout the network stack and drivers but is not expected to cause
any additional issues.

In the next step this will allow us to instantiate the virtualization
container structures and switch from using global variables to their
"containerized" counterparts.

Reviewed by: bz, julian
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# 8b615593 02-Oct-2008 Marko Zec <zec@FreeBSD.org>

Step 1.5 of importing the network stack virtualization infrastructure
from the vimage project, as per plan established at devsummit 08/08:
http://wiki.freebsd.org/Image/Notes200808DevSummit

Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator
macros, and CURVNET_SET() context setting macros, all currently
resolving to NOPs.

Prepare for virtualization of selected SYSCTL objects by introducing a
family of SYSCTL_V_*() macros, currently resolving to their global
counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT().

Move selected #defines from sys/sys/vimage.h to newly introduced header
files specific to virtualized subsystems (sys/net/vnet.h,
sys/netinet/vinet.h etc.).

All the changes are verified to have zero functional impact at this
point in time by doing MD5 comparision between pre- and post-change
object files(*).

(*) netipsec/keysock.c did not validate depending on compile time options.

Implemented by: julian, bz, brooks, zec
Reviewed by: julian, bz, brooks, kris, rwatson, ...
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation