History log of /openbsd-current/usr.sbin/vmd/vmd.c
Revision Date Author Comments
# 1.157 18-May-2024 jsg

remove prototypes with no matching function


# 1.156 08-Apr-2024 tobhe

Call daemon() only in parent and before proc_exec() to avoid orphaning child
processes. Synced from relayd.

ok mlarkin@ dv@


Revision tags: OPENBSD_7_5_BASE
# 1.155 05-Feb-2024 dv

Cleanup fcntl(3) usage and fd lifetimes in vmd(8).

Remove extraneous fcntl(3) usage for setting fd features that can
be set at time of open(2), pipe2(2), or socketpair(2). Also cleans
up pty creation switching to using functions from libutil instead
of direct ioctl(2) calls.

ok mlarkin@, original diff ok claudio@ as well.


# 1.154 04-Feb-2024 dv

Prevent null pointer deref if vm isn't found.

This area of code in vmd(8) is suspect, but the null dereference
is easily avoided.

Found by smatch, reported by and ok jsg@


# 1.153 18-Jan-2024 claudio

Use imsg_get_fd() in vmd.

vmd uses a lot of fd passing and does it sometimes via extra abstraction
so this just tries to convert the code without any optimisations.

ok dv@


Revision tags: OPENBSD_7_4_BASE
# 1.152 26-Sep-2023 dv

vmd(8): disambiguate log messages per vm and device.

The logging output from vmd(8) often specifies the function performing
the logging, but leaves which vm or vm device to guesswork and
reading tea leaves.

Change the logging formatting to prefix with information about the
specific vm and potentially the device subprocess. Most of this
logging is behind the "verbose" mode, but for warnings this will
clarify which vm or device logged the warning.

The format of vm/<name>/<device><index> is chosen to be concise and
less ugly than other approaches. This adjusts the process naming
for devices to match, dropping the use of brackets.

In the process of this change, updating log settings dynamically
via vmctl(8) is fixed by properly broadcasting that information to
the device subprocesses. The "vmm" process also now updates its own
state properly, so settings survive vm reboots.

ok mlarkin@


# 1.151 03-Jul-2023 jasper

when shutting down a vm, handle the VM id in the same way as a VM name argument

ok dv@


# 1.150 18-Jun-2023 op

relax absolute path requirement for configtest (-n)

ok dv@


# 1.149 13-May-2023 dv

vmm(4)/vmd(8): switch to anonymous shared mappings.

While splitting out emulated virtio network and block devices into
separate processes, I originally used named mappings via shm_mkstemp(3).
While this functionally achieved the desired result, it had two
unintended consequences:

1) tearing down a vm process and its child processes required
excessive locking as the guest memory was tied into the VFS layer.

2) it was observed by mlarkin@ that actions in other parts of the
VFS layer could cause some of the guest memory to flush to storage,
possibly filling /tmp.

This commit adds a new vmm(4) ioctl dedicated to allowing a process
to request that the kernel share a mapping of guest memory into its own vm
space. This requires an open fd to /dev/vmm (requiring root) and
both the "vmm" and "proc" pledge(2) promises. In addition, the caller
must know enough about the original memory ranges to reconstruct them
to make the vm's ranges.

Tested with help from Mischa Peters.

ok mlarkin@


# 1.148 12-May-2023 dv

vmd(8): fix segfault on vm creation.

vm_instance was using the wrong vm instance for checking the
vm_kernel_path member. Switch to using the value from the parent
vm instance in the check for if a kernel is known.

Issue reported by kn@. OK mlarkin@, kn@.


# 1.147 12-May-2023 dv

vmd(8): fix console attach from vmctl(8).

Adding in the ability to override the boot kernel created an edge
case in the ipc message handling logic for the parent process (vmd)
when receiving a "start vm" request. Result was incorrectly responding
to the control process, and as a result the vmctl client, with a
bogus "start vm response" reply with an empty tty name.

This commit rewrites the logic of how vmd goes about processing the
"start vm" request with the aim of making it simpler to understand
while addressing the edge case.

Issue reported by kn@. OK mlarkin@.


# 1.146 28-Apr-2023 dv

vmd(8)/vmctl(8): allow vm owners to override boot kernel.

vmd allows non-root users to "own" a vm defined in vm.conf(5). While
the user can start/stop the vm, if they break their filesystem they
have no means of booting recovery media like a ramdisk kernel.

This change opens the provided boot kernel via vmctl and passes the
file descriptor through the control channel to vmd. The next boot
of the vm will use the provided file descriptor as boot kernel/bios.
Subsequent boots (e.g. a reboot) will return to using behavior
defined in vm.conf or the default bios image.

ok mlarkin@


# 1.145 27-Apr-2023 dv

vmd(8): introduce multi-process model for virtio devices.

Isolate virtio network and block device emulation in dedicated
processes, forked and exec'd from the vm process. This allows for
tightening pledge promises to just "stdio".

Communication between the vcpu's and these devices now occurs via
imsg channels, which adds the benefit of not always blocking the
vcpu thread while emulating the device.

With this commit, it's possible that vmd is the first open source
hypervisor that *defaults* to a multi-process device emulation
model without requiring any additional configuration from the
operator.

Testing help from phessler@ and Mischa Peters.

ok mlarkin@


# 1.144 25-Apr-2023 dv

vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.

The object sent to vmm(4) contained file paths and details the
kernel does not need for cpu virtualization as device emulation is
in userland. Effectively, "pull up" the struct members from the
vm_create_params struct to the parent vmop_create_params struct.

This allows us to clean up some of vmd(8) and simplify things for
switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd,
etc.) to allow users to boot recovery ramdisk kernels.

ok mlarkin@


# 1.143 24-Apr-2023 kn

Missing the optional default config is not an error

/var/log/{messages,daemon} logs ENOENT as error on default configless vmd.
Only complain on explicitly passed files and print a debug hint under `-vv'
in case someone forgot to populate their /etc/vm.conf.

OK dv mlarkin


# 1.142 23-Apr-2023 dv

vmd(8): teach vmm process how to exec.

Use execvp(2) to launch vm children with new address spaces.
Consequently, introduces use of unveil(2) into the vmm and vm
processes.

This imposes the requirement of launching vmd with absolute paths,
similar to sshd(8).

ok mlarkin@


# 1.141 19-Apr-2023 jsg

remove duplicate includes


# 1.140 16-Apr-2023 dv

vmd(8): clean up fd closing in vmm process.

Some mild tidying of fd closing in the vmm process in prep for
landing parts of my fork+exec diff.

With input from guenther@ on the nuances of if/when EINTR may happen
in a call to close(2).

ok mlarkin@


# 1.139 02-Apr-2023 dv

vmd(8): migrate vmd_vm.vm_ttyname to char array.

Other structs use a fixed length array already. This allows a vmd_vm
object to be transmitted over an ipc channel, too.

Additionally, solves a segfault caused by a strlcpy(3) in an error
path.

ok mlarkin@


Revision tags: OPENBSD_7_3_BASE
# 1.138 28-Jan-2023 dv

Move some header definitions from vmm(4) to vmd(8).

Part of an ongoing effort to move userland-specific information out
of a kernel header and directly into vmd(8). No functional change.

ok mlarkin@


# 1.137 22-Jan-2023 dv

vmd(8): don't remove known vm's from the config on error.

Multiple error paths, specifically the one related to if a guest
cannot allocate memory at start, resulted in a known vm (via
vm.conf(5)) being removed from the vm list. Adjust the error paths
to check if the failing vm is defined in the config before tearing
it down.

Tested with help from beck@ and Mischa Peters.

ok beck@


# 1.136 14-Jan-2023 dv

Only open /dev/vmm once in vmd(8).

Have the parent process open /dev/vmm and send the fd to the vmm
child process. Only the vmm process and its resulting children
(guest vms) need it for ioctl calls.
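
For illustration only: vmd passes descriptors through imsg, which is not
shown here. The sketch below uses raw SCM_RIGHTS control messages over a
UNIX-domain socket as a stand-in for the same mechanism. send_fd() and
recv_fd() are hypothetical names.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_cmsgbuf {
	struct cmsghdr	hdr;
	unsigned char	buf[CMSG_SPACE(sizeof(int))];
};

/* Send fd over the UNIX-domain socket sock.  Returns 0 or -1. */
int
send_fd(int sock, int fd)
{
	struct msghdr		 msg;
	struct cmsghdr		*cmsg;
	struct iovec		 iov;
	union fd_cmsgbuf	 cb;
	char			 byte = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&cb, 0, sizeof(cb));
	iov.iov_base = &byte;		/* one byte of payload is required */
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cb.buf;
	msg.msg_controllen = sizeof(cb.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return (sendmsg(sock, &msg, 0) == -1 ? -1 : 0);
}

/* Receive one descriptor from sock.  Returns the new fd or -1. */
int
recv_fd(int sock)
{
	struct msghdr		 msg;
	struct cmsghdr		*cmsg;
	struct iovec		 iov;
	union fd_cmsgbuf	 cb;
	char			 byte;
	int			 fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &byte;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cb.buf;
	msg.msg_controllen = sizeof(cb.buf);

	if (recvmsg(sock, &msg, 0) <= 0)
		return (-1);
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return (-1);
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return (fd);
}
```

The receiver gets its own descriptor number referring to the same open
file, so only the sending process ever needs permission to open the device.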

ok kn@


# 1.135 28-Dec-2022 jmc

spelling fixes; from paul tagliamonte
any parts of his diff not taken are noted on tech


# 1.134 15-Dec-2022 dv

Add explicit casts to ctype functions in vmd(8).
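
A small sketch of why the cast matters, not taken from the diff: the
ctype functions take an int that must be representable as unsigned char
(or be EOF), so a plain char with the high bit set has to be cast first.
count_digits() is a hypothetical helper.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the decimal digits in s, casting each char before isdigit(). */
size_t
count_digits(const char *s)
{
	size_t n = 0;

	for (; *s != '\0'; s++) {
		/* Without the cast, a negative char is undefined behavior. */
		if (isdigit((unsigned char)*s))
			n++;
	}
	return (n);
}
```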

OK millert@


# 1.133 31-Oct-2022 dv

vmd(8): remove unfinished user accounting.

User accounting and enforcement was never finished. tedu the thing
until someone wants to pick it up and finish it.

Originally found by Matthew Martin.

ok mlarkin@, kn@. input from tb@.


Revision tags: OPENBSD_7_2_BASE
# 1.132 13-Sep-2022 martijn

branches: 1.132.2;
Add (partial) support for agentx in vmd.

Metrics can be found under mib-2.236 and VM-MIB (RFC7666).

Stress tested by and happy noises from Mischa Peters
OK dv@


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

branches: 1.130.2;
vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.
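
The under-read problem can be sketched as follows. This is not the
atomicio helper itself, only a minimal stand-in with the hypothetical name
read_full(): a single read(2) may return fewer bytes than requested, so
the caller loops until the full length arrives, EOF is reached, or a hard
error occurs.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Read exactly len bytes from fd unless EOF or a hard error occurs.
 * Returns the number of bytes read (short only on EOF), or -1 on error.
 */
ssize_t
read_full(int fd, void *buf, size_t len)
{
	char	*p = buf;
	size_t	 off = 0;
	ssize_t	 n;

	while (off < len) {
		n = read(fd, p + off, len - off);
		if (n == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return (-1);
		}
		if (n == 0)
			break;			/* EOF: report short count */
		off += (size_t)n;
	}
	return ((ssize_t)off);
}
```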

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.
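
The batching arithmetic can be sketched like this. It is an
illustration under assumptions, not vmd's scheduling code: the function
and macro names are hypothetical, and only the 30 second default comes
from the text above.

```c
/* Default delay between batches, per the commit message. */
#define START_DELAY_SEC	30

/*
 * Return how many seconds to wait before starting the vm at position
 * idx (0-based) when vms are started `parallel` at a time.
 */
unsigned int
start_delay(unsigned int idx, unsigned int parallel)
{
	if (parallel == 0)
		parallel = 1;	/* avoid division by zero */
	return ((idx / parallel) * START_DELAY_SEC);
}
```

With a parallelism of 4, vms 0 to 3 start immediately, vms 4 to 7 after
30 seconds, and vms 8 to 11 after 60 seconds.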

ok mlarkin@ (also addressed comments from cheloha@)


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
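
A short sketch of the convention, with a hypothetical helper name: the
failure test compares against -1 exactly, and errno is consulted only on
that path.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open path read-only.  On success store the descriptor in *fdp and
 * return 0; on failure return the errno value set by open(2).
 */
int
try_open(const char *path, int *fdp)
{
	int fd;

	/* "== -1", not "< 0": -1 is the only documented error return. */
	if ((fd = open(path, O_RDONLY)) == -1)
		return (errno);
	*fdp = fd;
	return (0);
}
```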


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this will make it easier for vmd to report
and check on the individual vm states
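
A sketch of the single-bitfield approach. The flag names, values and
struct below are hypothetical and chosen for this example only; they are
not the definitions used in vmd.

```c
/* Hypothetical state flags, one bit each. */
#define ST_RUNNING	0x01
#define ST_DISABLED	0x02
#define ST_PAUSED	0x04

struct vm_sketch {
	unsigned int	state;	/* bitwise OR of ST_* flags */
};

void
vm_sketch_pause(struct vm_sketch *vm)
{
	vm->state |= ST_PAUSED;
}

void
vm_sketch_unpause(struct vm_sketch *vm)
{
	vm->state &= ~ST_PAUSED;
}

/* Map the bitfield to a status string; paused takes precedence. */
const char *
vm_sketch_state_name(const struct vm_sketch *vm)
{
	if (vm->state & ST_PAUSED)
		return ("paused");
	if (vm->state & ST_RUNNING)
		return ("running");
	if (vm->state & ST_DISABLED)
		return ("disabled");
	return ("stopped");
}
```

One field means a vm can be both running and paused without two
variables drifting out of sync, and reporting becomes a single lookup.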

no functional change intended

ok ccardenas@ mlarkin@


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.
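
The rate limit described above can be sketched roughly as below. Only
the two #define names and values come from this message. The struct, the
function, and the choice to reset the counter after a slow restart are
assumptions of this sketch.

```c
#include <time.h>

#define VM_START_RATE_SEC	6	/* restarts faster than this count */
#define VM_START_RATE_LIMIT	3	/* fast restarts before stopping */

/* Hypothetical per-vm bookkeeping for this sketch. */
struct start_rate {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_count;	/* consecutive fast restarts seen */
};

/*
 * Record a (re)start at time now.  Returns 0 if the vm may start, or
 * -1 if it restarted too quickly too many times and should be stopped.
 */
int
start_rate_check(struct start_rate *sr, time_t now)
{
	if (sr->last_start != 0 &&
	    now - sr->last_start < VM_START_RATE_SEC)
		sr->fast_count++;
	else
		sr->fast_count = 0;	/* assumption: slow restart resets */
	sr->last_start = now;

	return (sr->fast_count >= VM_START_RATE_LIMIT ? -1 : 0);
}
```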

OK mlarkin@


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
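
A minimal sketch of the pattern, with hypothetical names: a
zero-initialized struct makes every fd field look like stdin, so fields
are set to -1 up front and the cleanup path skips that sentinel.

```c
#include <unistd.h>

#define SKETCH_NFDS	4	/* hypothetical number of descriptors */

struct fdset_sketch {
	int	fds[SKETCH_NFDS];
};

/* Mark every slot as "no descriptor" rather than leaving it at 0. */
void
fdset_init(struct fdset_sketch *s)
{
	int i;

	for (i = 0; i < SKETCH_NFDS; i++)
		s->fds[i] = -1;
}

/* Close only slots that hold a descriptor.  Returns how many closed. */
int
fdset_close(struct fdset_sketch *s)
{
	int i, n = 0;

	for (i = 0; i < SKETCH_NFDS; i++) {
		if (s->fds[i] == -1)
			continue;	/* never call close(-1) */
		close(s->fds[i]);
		s->fds[i] = -1;
		n++;
	}
	return (n);
}
```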
OK mlarkin@


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU attacks for instances.
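
The general pattern, sketched under assumptions: check attributes with
fstat(2) on the descriptor that will actually be used, not with stat(2) on
the path before opening, since the path can be swapped in between. The
helper name and the specific checks (regular file, matching owner) are
hypothetical and simpler than what vmd enforces.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open path read-only, then verify on the open descriptor that it is a
 * regular file owned by uid.  Returns the fd, or -1 on failure.
 */
int
open_owned(const char *path, uid_t uid)
{
	struct stat	st;
	int		fd;

	if ((fd = open(path, O_RDONLY)) == -1)
		return (-1);
	/* The checks apply to the file we hold, whatever path now names. */
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode) ||
	    st.st_uid != uid) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```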

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-'. but does not start with the latter
three.
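
The rule stated above can be sketched as a small predicate. This is an
illustration, not the function from the commit: the name is hypothetical,
rejecting an empty name is an assumption, and any length limit is left
out.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Return 1 if name consists only of alphanumerics, '.', '_' and '-'
 * and does not start with one of those three characters, else 0.
 */
int
vm_name_valid(const char *name)
{
	size_t i;

	if (name == NULL || name[0] == '\0')
		return (0);	/* assumption: empty names are invalid */
	if (name[0] == '.' || name[0] == '_' || name[0] == '-')
		return (0);
	for (i = 0; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return (0);
	}
	return (1);
}
```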

ok mlarkin@ pd@


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow adding an interface to an interface group with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, e.g. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, e.g.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows removing the vm from
the daemon's list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow adding
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root, but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows further reducing the privileges, giving better
pledge(2), and adding some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows removing the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking for pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many
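
The poll-before-accept idea described above can be sketched roughly as
follows. This is a minimal illustration, not the code from this revision:
wait_readable() is a hypothetical helper name, and the caller is assumed
to call accept4() only when it returns 1. Unlike accept4(), poll(2) is not
restarted after a signal, so the loop gets a chance to check an exit flag.

```c
#include <errno.h>
#include <poll.h>

/*
 * Wait until fd is readable. Returns 1 if readable, 0 on timeout,
 * and -1 on error, including EINTR when a signal arrived.
 * Hypothetical helper, not taken from vmd.
 */
int
wait_readable(int fd, int timeout_ms)
{
	struct pollfd pfd;
	int n;

	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;

	n = poll(&pfd, 1, timeout_ms);
	if (n == -1)
		return (-1);	/* errno is EINTR after a signal */
	if (n == 0)
		return (0);
	return ((pfd.revents & POLLIN) ? 1 : -1);
}
```

A control loop would then treat a -1 return with errno set to EINTR as the
point at which to test its "quit" condition before polling again.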


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.151 03-Jul-2023 jasper

when shutting down a vm, handle the VM id in the same way as a VM name argument

ok dv@


# 1.150 18-Jun-2023 op

relax absolute path requirement for configtest (-n)

ok dv@


# 1.149 13-May-2023 dv

vmm(4)/vmd(8): switch to anonymous shared mappings.

While splitting out emulated virtio network and block devices into
separate processes, I originally used named mappings via shm_mkstemp(3).
While this functionally achieved the desired result, it had two
unintended consequences:

1) tearing down a vm process and its child processes required
excessive locking as the guest memory was tied into the VFS layer.

2) it was observed by mlarkin@ that actions in other parts of the
VFS layer could cause some of the guest memory to flush to storage,
possibly filling /tmp.

This commit adds a new vmm(4) ioctl dedicated to allowing a process
to request that the kernel share a mapping of guest memory into its own vm
space. This requires an open fd to /dev/vmm (requiring root) and
both the "vmm" and "proc" pledge(2) promises. In addition, the caller
must know enough about the original memory ranges to reconstruct them
to make the vm's ranges.

Tested with help from Mischa Peters.

ok mlarkin@


# 1.148 12-May-2023 dv

vmd(8): fix segfault on vm creation.

vm_instance was using the wrong vm instance for checking the
vm_kernel_path member. Switch to using the value from the parent
vm instance in the check for if a kernel is known.

Issue reported by kn@. OK mlarkin@, kn@.


# 1.147 12-May-2023 dv

vmd(8): fix console attach from vmctl(8).

Adding in the ability to override the boot kernel created an edge
case in the ipc message handling logic for the parent process (vmd)
when receiving a "start vm" request. Result was incorrectly responding
to the control process, and as a result the vmctl client, with a
bogus "start vm response" reply with an empty tty name.

This commit rewrites the logic of how vmd goes about processing the
"start vm" request with the aim of making it simpler to understand
while addressing the edge case.

Issue reported by kn@. OK mlarkin@.


# 1.146 28-Apr-2023 dv

vmd(8)/vmctl(8): allow vm owners to override boot kernel.

vmd allows non-root users to "own" a vm defined in vm.conf(5). While
the user can start/stop the vm, if they break their filesystem they
have no means of booting recovery media like a ramdisk kernel.

This change opens the provided boot kernel via vmctl and passes the
file descriptor through the control channel to vmd. The next boot
of the vm will use the provided file descriptor as boot kernel/bios.
Subsequent boots (e.g. a reboot) will return to using behavior
defined in vm.conf or the default bios image.

ok mlarkin@


# 1.145 27-Apr-2023 dv

vmd(8): introduce multi-process model for virtio devices.

Isolate virtio network and block device emulation in dedicated
processes, forked and exec'd from the vm process. This allows for
tightening pledge promises to just "stdio".

Communication between the vcpu's and these devices now occurs via
imsg channels, which adds the benefit of not always blocking the
vcpu thread while emulating the device.

With this commit, it's possible that vmd is the first open source
hypervisor that *defaults* to a multi-process device emulation
model without requiring any additional configuration from the
operator.

Testing help from phessler@ and Mischa Peters.

ok mlarkin@


# 1.144 25-Apr-2023 dv

vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.

The object sent to vmm(4) contained file paths and details the
kernel does not need for cpu virtualization as device emulation is
in userland. Effectively, "pull up" the struct members from the
vm_create_params struct to the parent vmop_create_params struct.

This allows us to clean up some of vmd(8) and simplify things for
switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd,
etc.) to allow users to boot recovery ramdisk kernels.

ok mlarkin@


# 1.143 24-Apr-2023 kn

Missing the optional default config is not an error

/var/log/{messages,daemon} logs ENOENT as error on default configless vmd.
Only complain on explicitly passed files and print a debug hint under `-vv'
in case someone forgot to populate their /etc/vm.conf.

OK dv mlarkin


# 1.142 23-Apr-2023 dv

vmd(8): teach vmm process how to exec.

Use execvp(2) to launch vm children with new address spaces.
Consequently, introduces use of unveil(2) into the vmm and vm
processes.

This imposes the requirement of launching vmd with absolute paths,
similar to sshd(8).

ok mlarkin@


# 1.141 19-Apr-2023 jsg

remove duplicate includes


# 1.140 16-Apr-2023 dv

vmd(8): clean up fd closing in vmm process.

Some mild tidying of fd closing in the vmm process in prep for
landing parts of my fork+exec diff.

With input from guenther@ on the nuances of if/when EINTR may happen
in a call to close(2).

ok mlarkin@


# 1.139 02-Apr-2023 dv

vmd(8): migrate vmd_vm.vm_ttyname to char array.

Other structs use a fixed length array already. This allows a vmd_vm
object to be transmitted over an ipc channel, too.

Additionally, solves a segfault caused by a strlcpy(3) in an error
path.

ok mlarkin@


Revision tags: OPENBSD_7_3_BASE
# 1.138 28-Jan-2023 dv

Move some header definitions from vmm(4) to vmd(8).

Part of an ongoing effort to move userland-specific information out
of a kernel header and directly into vmd(8). No functional change.

ok mlarkin@


# 1.137 22-Jan-2023 dv

vmd(8): don't remove known vm's from the config on error.

Multiple error paths, specifically the one related to if a guest
cannot allocate memory at start, resulted in a known vm (via
vm.conf(5)) being removed from the vm list. Adjust the error paths
to check if the failing vm is defined in the config before tearing
it down.

Tested with help from beck@ and Mischa Peters.

ok beck@


# 1.136 14-Jan-2023 dv

Only open /dev/vmm once in vmd(8).

Have the parent process open /dev/vmm and send the fd to the vmm
child process. Only the vmm process and its resulting children
(guest vms) need it for ioctl calls.

ok kn@


# 1.135 28-Dec-2022 jmc

spelling fixes; from paul tagliamonte
any parts of his diff not taken are noted on tech


# 1.134 15-Dec-2022 dv

Add explicit casts to ctype functions in vmd(8).

OK millert@


# 1.133 31-Oct-2022 dv

vmd(8): remove unfinished user accounting.

User accounting and enforcement was never finished. tedu the thing
until someone wants to pick it up and finish it.

Originally found by Matthew Martin.

ok mlarkin@, kn@. input from tb@.


Revision tags: OPENBSD_7_2_BASE
# 1.132 13-Sep-2022 martijn

branches: 1.132.2;
Add (partial) support for agentx in vmd.

Metrics can be found under mib-2.236 and VM-MIB (RFC7666).

Stress tested by and happy noises from Mischa Peters
OK dv@


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

branches: 1.130.2;
vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses the 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
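
As a rough illustration of the batching arithmetic described above, the
delay for a given VM could be computed as shown here. stagger_delay() and
its parameter names are hypothetical and are not taken from vmd; the sketch
only assumes that VMs are started in fixed-size batches with a fixed pause
between batches.

```c
/*
 * Seconds to wait before starting the VM at position idx (0-based)
 * when VMs are started in batches of `parallel` with `delay`
 * seconds between batches. Hypothetical helper.
 */
unsigned int
stagger_delay(unsigned int idx, unsigned int parallel, unsigned int delay)
{
	if (parallel == 0)
		parallel = 1;	/* avoid division by zero */
	return ((idx / parallel) * delay);
}
```

With a parallelism of 4 and the default delay of 30 seconds, the first four
VMs would start immediately and the fifth after 30 seconds.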


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
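
A single state bitfield of this kind might look like the sketch below. The
EX_STATE_* names and bit values are assumptions made for the example and do
not claim to match the definitions in vmd.h; ex_state_name() is likewise a
hypothetical helper.

```c
/* Illustrative flag values; the names and bits are assumptions. */
#define EX_STATE_RUNNING	0x01
#define EX_STATE_DISABLED	0x02
#define EX_STATE_PAUSED		0x04

/* Map a state bitfield to a short label; paused takes precedence. */
const char *
ex_state_name(unsigned int state)
{
	if (state & EX_STATE_PAUSED)
		return ("paused");
	if (state & EX_STATE_RUNNING)
		return ("running");
	if (state & EX_STATE_DISABLED)
		return ("disabled");
	return ("stopped");
}
```

Setting and clearing a flag is then a matter of "state |= EX_STATE_PAUSED"
and "state &= ~EX_STATE_PAUSED", instead of updating several variables.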


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
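
The rate limit described above can be sketched as follows. Only the two
constants and their values come from the commit message; struct start_rate,
start_rate_check() and the choice to reset the counter after a slow restart
are assumptions for illustration and may differ from what vmd does.

```c
#include <time.h>

#define VM_START_RATE_SEC	6	/* min. seconds between starts */
#define VM_START_RATE_LIMIT	3	/* fast restarts before stopping */

/* Hypothetical per-VM bookkeeping, not vmd's actual struct. */
struct start_rate {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_count;	/* consecutive fast restarts seen so far */
};

/*
 * Record a (re)start at time `now`. Returns 1 if the VM may start,
 * 0 if it restarted too quickly too often and should be stopped.
 */
int
start_rate_check(struct start_rate *sr, time_t now)
{
	if (sr->last_start != 0 &&
	    now - sr->last_start < VM_START_RATE_SEC)
		sr->fast_count++;
	else
		sr->fast_count = 0;	/* assumption: slow restart resets */
	sr->last_start = now;

	return (sr->fast_count < VM_START_RATE_LIMIT);
}
```

A guest stuck in a triple-fault loop would hit the limit on its third fast
restart, while a guest that reboots once an hour would never be counted.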


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
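
The pattern behind this fix can be illustrated with a small sketch. The
struct and function names below are hypothetical and the array size is
arbitrary; the point is only that unused slots are set to -1 rather than
left at 0, and that close(2) is skipped for them.

```c
#include <unistd.h>

#define NFDS	4	/* arbitrary size for the example */

/* Hypothetical container for a VM's descriptors. */
struct vm_fds {
	int	fd[NFDS];
};

/* Mark every slot as unused; a zeroed slot would alias stdin. */
void
vm_fds_init(struct vm_fds *v)
{
	int i;

	for (i = 0; i < NFDS; i++)
		v->fd[i] = -1;
}

/* Close only slots that hold a descriptor; returns how many were closed. */
int
vm_fds_close(struct vm_fds *v)
{
	int i, n = 0;

	for (i = 0; i < NFDS; i++) {
		if (v->fd[i] == -1)
			continue;
		close(v->fd[i]);
		v->fd[i] = -1;
		n++;
	}
	return (n);
}
```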


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@
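
A minimal sketch of the open-then-check approach, assuming a simple policy
of "regular file owned by the given uid". open_owned_file() is a
hypothetical helper and the checks vmd actually performs are not shown
here; the example only demonstrates checking with fstat(2) on the
descriptor instead of stat(2) on the path.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open path read-only and verify on the open descriptor that it is a
 * regular file owned by uid. Returns the fd, or -1 on failure.
 * Hypothetical helper; the policy is an assumption for the example.
 */
int
open_owned_file(const char *path, uid_t uid)
{
	struct stat st;
	int fd;

	if ((fd = open(path, O_RDONLY)) == -1)
		return (-1);
	/* fstat(2) inspects the file we actually opened, not the path. */
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode) ||
	    st.st_uid != uid) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```

Because the check runs on the already opened descriptor, replacing the path
with a different file between the check and the use no longer matters.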


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows opening vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains only
alphanumeric characters plus '.', '_' and '-', but does not start with any
of the latter three.

ok mlarkin@ pd@


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows configuring VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@
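
A /31 subnet holds exactly two addresses, which is why one can go to the host side and the other to the VM. The sketch below shows only that arithmetic. The names local_pair and local_pair_from_addr are hypothetical, and which of the two addresses goes to the tap(4) interface is an assumption, not something the commit message states.

```c
#include <assert.h>
#include <stdint.h>

/* The two addresses of a /31 subnet, in host byte order. */
struct local_pair {
	uint32_t host;	/* assumed: configured on the tap(4) interface */
	uint32_t guest;	/* assumed: handed to the VM via BOOTP */
};

/*
 * Hypothetical helper: split the /31 containing `addr` into its two
 * addresses. A /31 leaves a single host bit, so the pair differs only
 * in the lowest bit.
 */
struct local_pair
local_pair_from_addr(uint32_t addr)
{
	struct local_pair p;

	p.host = addr & ~UINT32_C(1);	/* even address */
	p.guest = addr | UINT32_C(1);	/* odd address */
	assert(p.guest == p.host + 1);
	return p;
}
```

For example, passing either 100.64.0.2 or 100.64.0.3 (as a 32-bit integer) yields the same pair, 100.64.0.2 and 100.64.0.3.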


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow starting disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of the VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow adding an interface to an interface group with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows removing the vm from the
daemon's list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows removing the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.155 05-Feb-2024 dv

Cleanup fcntl(3) usage and fd lifetimes in vmd(8).

Remove extraneous fcntl(3) usage for setting fd features that can
be set at time of open(2), pipe2(2), or socketpair(2). Also cleans
up pty creation switching to using functions from libutil instead
of direct ioctl(2) calls.

ok mlarkin@, original diff ok claudio@ as well.


# 1.154 04-Feb-2024 dv

Prevent null pointer deref if vm isn't found.

This area of code in vmd(8) is suspect, but the null dereference
is easily avoided.

Found by smatch, reported by and ok jsg@


# 1.153 18-Jan-2024 claudio

Use imsg_get_fd() in vmd.

vmd uses a lot of fd passing and does it sometimes via extra abstraction
so this just tries to convert the code without any optimisations.

ok dv@


Revision tags: OPENBSD_7_4_BASE
# 1.152 26-Sep-2023 dv

vmd(8): disambiguate log messages per vm and device.

The logging output from vmd(8) often specifies the function performing
the logging, but leaves which vm or vm device to guesswork and
reading tea leaves.

Change the logging formatting to prefix with information about the
specific vm and potentially the device subprocess. Most of this
logging is behind the "verbose" mode, but for warnings this will
clarify which vm or device logged the warning.

The format of vm/<name>/<device><index> is chosen to be concise and
less ugly than other approaches. This adjusts the process naming
for devices to match, dropping the use of brackets.

In the process of this change, updating log settings dynamically
via vmctl(8) is fixed by properly broadcasting that information to
the device subprocesses. The "vmm" process also now updates its own
state properly, so settings survive vm reboots.

ok mlarkin@


# 1.151 03-Jul-2023 jasper

when shutting down a vm, handle the VM id in the same way as a VM name argument

ok dv@


# 1.150 18-Jun-2023 op

relax absolute path requirement for configtest (-n)

ok dv@


# 1.149 13-May-2023 dv

vmm(4)/vmd(8): switch to anonymous shared mappings.

While splitting out emulated virtio network and block devices into
separate processes, I originally used named mappings via shm_mkstemp(3).
While this functionally achieved the desired result, it had two
unintended consequences:

1) tearing down a vm process and its child processes required
excessive locking as the guest memory was tied into the VFS layer.

2) it was observed by mlarkin@ that actions in other parts of the
VFS layer could cause some of the guest memory to flush to storage,
possibly filling /tmp.

This commit adds a new vmm(4) ioctl dedicated to allowing a process to
request that the kernel share a mapping of guest memory into its own vm
space. This requires an open fd to /dev/vmm (requiring root) and
both the "vmm" and "proc" pledge(2) promises. In addition, the caller
must know enough about the original memory ranges to reconstruct them
to make the vm's ranges.

Tested with help from Mischa Peters.

ok mlarkin@


# 1.148 12-May-2023 dv

vmd(8): fix segfault on vm creation.

vm_instance was using the wrong vm instance for checking the
vm_kernel_path member. Switch to using the value from the parent
vm instance in the check for if a kernel is known.

Issue reported by kn@. OK mlarkin@, kn@.


# 1.147 12-May-2023 dv

vmd(8): fix console attach from vmctl(8).

Adding in the ability to override the boot kernel created an edge
case in the ipc message handling logic for the parent process (vmd)
when receiving a "start vm" request. Result was incorrectly responding
to the control process, and as a result the vmctl client, with a
bogus "start vm response" reply with an empty tty name.

This commit rewrites the logic of how vmd goes about processing the
"start vm" request with the aim of making it simpler to understand
while addressing the edge case.

Issue reported by kn@. OK mlarkin@.


# 1.146 28-Apr-2023 dv

vmd(8)/vmctl(8): allow vm owners to override boot kernel.

vmd allows non-root users to "own" a vm defined in vm.conf(5). While
the user can start/stop the vm, if they break their filesystem they
have no means of booting recovery media like a ramdisk kernel.

This change opens the provided boot kernel via vmctl and passes the
file descriptor through the control channel to vmd. The next boot
of the vm will use the provided file descriptor as boot kernel/bios.
Subsequent boots (e.g. a reboot) will return to using behavior
defined in vm.conf or the default bios image.

ok mlarkin@


# 1.145 27-Apr-2023 dv

vmd(8): introduce multi-process model for virtio devices.

Isolate virtio network and block device emulation in dedicated
processes, forked and exec'd from the vm process. This allows for
tightening pledge promises to just "stdio".

Communication between the vcpu's and these devices now occurs via
imsg channels, which adds the benefit of not always blocking the
vcpu thread while emulating the device.

With this commit, it's possible that vmd is the first open source
hypervisor that *defaults* to a multi-process device emulation
model without requiring any additional configuration from the
operator.

Testing help from phessler@ and Mischa Peters.

ok mlarkin@


# 1.144 25-Apr-2023 dv

vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.

The object sent to vmm(4) contained file paths and details the
kernel does not need for cpu virtualization as device emulation is
in userland. Effectively, "pull up" the struct members from the
vm_create_params struct to the parent vmop_create_params struct.

This allows us to clean up some of vmd(8) and simplify things for
switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd,
etc.) to allow users to boot recovery ramdisk kernels.

ok mlarkin@


# 1.143 24-Apr-2023 kn

Missing the optional default config is not an error

/var/log/{messages,daemon} logs ENOENT as error on default configless vmd.
Only complain on explicitly passed files and print a debug hint under `-vv'
in case someone forgot to populate their /etc/vm.conf.

OK dv mlarkin


# 1.142 23-Apr-2023 dv

vmd(8): teach vmm process how to exec.

Use execvp(2) to launch vm children with new address spaces.
Consequently, introduces use of unveil(2) into the vmm and vm
processes.

This imposes the requirement of launching vmd with absolute paths,
similar to sshd(8).

ok mlarkin@


# 1.141 19-Apr-2023 jsg

remove duplicate includes


# 1.140 16-Apr-2023 dv

vmd(8): clean up fd closing in vmm process.

Some mild tidying of fd closing in the vmm process in prep for
landing parts of my fork+exec diff.

With input from guenther@ on the nuances of if/when EINTR may happen
in a call to close(2).

ok mlarkin@


# 1.139 02-Apr-2023 dv

vmd(8): migrate vmd_vm.vm_ttyname to char array.

Other structs use a fixed length array already. This allows a vmd_vm
object to be transmitted over an ipc channel, too.

Additionally, solves a segfault caused by a strlcpy(3) in an error
path.

ok mlarkin@


Revision tags: OPENBSD_7_3_BASE
# 1.138 28-Jan-2023 dv

Move some header definitions from vmm(4) to vmd(8).

Part of an ongoing effort to move userland-specific information out
of a kernel header and directly into vmd(8). No functional change.

ok mlarkin@


# 1.137 22-Jan-2023 dv

vmd(8): don't remove known vm's from the config on error.

Multiple error paths, specifically the one related to if a guest
cannot allocate memory at start, resulted in a known vm (via
vm.conf(5)) being removed from the vm list. Adjust the error paths
to check if the failing vm is defined in the config before tearing
it down.

Tested with help from beck@ and Mischa Peters.

ok beck@


# 1.136 14-Jan-2023 dv

Only open /dev/vmm once in vmd(8).

Have the parent process open /dev/vmm and send the fd to the vmm
child process. Only the vmm process and its resulting children
(guest vms) need it for ioctl calls.

ok kn@


# 1.135 28-Dec-2022 jmc

spelling fixes; from paul tagliamonte
any parts of his diff not taken are noted on tech


# 1.134 15-Dec-2022 dv

Add explicit casts to ctype functions in vmd(8).

OK millert@


# 1.133 31-Oct-2022 dv

vmd(8): remove unfinished user accounting.

User accounting and enforcement was never finished. tedu the thing
until someone wants to pick it up and finish it.

Originally found by Matthew Martin.

ok mlarkin@, kn@. input from tb@.


Revision tags: OPENBSD_7_2_BASE
# 1.132 13-Sep-2022 martijn

branches: 1.132.2;
Add (partial) support for agentx in vmd.

Metrics can be found under mib-2.236 and VM-MIB (RFC7666).

Stress tested by and happy noises from Mischa Peters
OK dv@


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

branches: 1.130.2;
vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
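
The batching described above can be sketched as a simple delay calculation. This is an illustration under stated assumptions, not the scheduling code in vmd: the helper name start_delay and the macro START_DELAY_SEC are hypothetical, and `parallel` stands in for the ncpuonline default.

```c
#include <assert.h>

#define START_DELAY_SEC	30	/* default delay between batches */

/*
 * Hypothetical helper: seconds to wait before starting the VM at
 * position `index` (0-based) when VMs start in batches of `parallel`.
 */
unsigned int
start_delay(unsigned int index, unsigned int parallel)
{
	assert(parallel > 0);
	return (index / parallel) * START_DELAY_SEC;
}
```

With a parallelism of 4, the first four VMs start immediately, the next four after 30 seconds, and so on.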


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
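
A minimal sketch of the convention described above, using open(2) as the example system call. The wrapper name open_readonly is hypothetical and only serves to show the exact comparison against -1.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` read-only and return the fd, or -1
 * with errno set. The failure test compares against -1 exactly, since
 * that is the only return value for which errno is meaningful.
 */
int
open_readonly(const char *path)
{
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd == -1)		/* not "fd < 0" */
		return -1;	/* errno was set by open(2) */
	return fd;
}
```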


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
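
As an illustration of tracking state in a single set of flag bits, the sketch below uses made-up flag names and values (ST_RUNNING, ST_PAUSED, ST_DISABLED); they are not copied from vmd.h, and the display priority in state_name is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag names and values; not copied from vmd.h. */
#define ST_RUNNING	0x01u
#define ST_PAUSED	0x02u
#define ST_DISABLED	0x04u

/* Set or clear the paused bit without disturbing the other bits. */
unsigned int
state_set_paused(unsigned int state, bool paused)
{
	/* In this sketch only a running VM can be (un)paused. */
	assert(state & ST_RUNNING);
	return paused ? (state | ST_PAUSED) : (state & ~ST_PAUSED);
}

/* Pick one label for display; the priority order is an assumption. */
const char *
state_name(unsigned int state)
{
	if (state & ST_DISABLED)
		return "disabled";
	if (state & ST_PAUSED)
		return "paused";
	if (state & ST_RUNNING)
		return "running";
	return "stopped";
}
```

Keeping all flags in one word means a status report needs only this one value, and a check such as "running but not paused" is a single mask comparison.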


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
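
The rate limit described above can be sketched as follows. The two constants and their values come from the commit message; the struct, its field names, and the function start_rate_allow are hypothetical, and resetting the counter after a slow restart is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define VM_START_RATE_SEC	6	/* a restart sooner than this is "fast" */
#define VM_START_RATE_LIMIT	3	/* fast restarts before the VM is stopped */

/* Hypothetical per-VM bookkeeping; the field names are illustrative. */
struct start_rate {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_count;	/* consecutive fast restarts seen so far */
};

/*
 * Record a start at time `now`. Returns true if the VM may start and
 * false once VM_START_RATE_LIMIT fast restarts have accumulated.
 * Resetting the counter after a slow restart is an assumption.
 */
bool
start_rate_allow(struct start_rate *sr, time_t now)
{
	assert(sr != NULL);

	if (sr->last_start != 0) {
		if (now - sr->last_start < VM_START_RATE_SEC)
			sr->fast_count++;
		else
			sr->fast_count = 0;
	}
	sr->last_start = now;

	return sr->fast_count < VM_START_RATE_LIMIT;
}
```

A VM that restarts at one-second intervals is allowed its initial start and two fast restarts; the third fast restart is refused.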


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@
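
The lookup order described above (derived image first, then the base) can be sketched with an in-memory stand-in. Nothing here reflects the real qcow2 on-disk layout or vmd's driver: struct image, NCLUSTERS, and image_lookup are hypothetical names used for illustration only.

```c
#include <assert.h>
#include <stddef.h>

#define NCLUSTERS	4	/* tiny illustrative image size */

/* Hypothetical in-memory stand-in for a qcow2 image. */
struct image {
	const char		*cluster[NCLUSTERS]; /* NULL: not allocated here */
	const struct image	*base;		     /* backing image, or NULL */
};

/*
 * Walk from the derived image towards its base and return the first
 * allocated copy of cluster `idx`, or NULL if no image in the chain
 * holds it (a real driver would read such a cluster as zeroes).
 */
const char *
image_lookup(const struct image *img, size_t idx)
{
	assert(idx < NCLUSTERS);

	for (; img != NULL; img = img->base) {
		if (img->cluster[idx] != NULL)
			return img->cluster[idx];
	}
	return NULL;
}
```

The sketch also shows why modifying the base is dangerous: every cluster the derived image has not overridden is still read from the base.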


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
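
A minimal sketch of the pattern, with hypothetical names (struct vm_fds
and its helpers are not from vmd(8)): a zero-initialized fd field looks
like stdin, so fields start at -1 and cleanup skips that value.

```c
#include <unistd.h>

#define NFDS	4	/* illustrative number of fd slots */

/* Hypothetical container for per-VM descriptors. */
struct vm_fds {
	int	fd[NFDS];
};

/* Mark every slot as unused; 0 would be mistaken for stdin. */
void
vm_fds_init(struct vm_fds *v)
{
	int i;

	for (i = 0; i < NFDS; i++)
		v->fd[i] = -1;
}

/* Close only the slots that hold a descriptor; returns how many. */
int
vm_fds_close(struct vm_fds *v)
{
	int i, n = 0;

	for (i = 0; i < NFDS; i++) {
		if (v->fd[i] == -1)
			continue;	/* never call close(-1) */
		close(v->fd[i]);
		v->fd[i] = -1;
		n++;
	}
	return (n);
}
```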


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU attacks for instances.
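
The general idea can be sketched as below: open first, then inspect the
open descriptor with fstat(2) so the checked file cannot be swapped for
another between check and use. The function name and the specific
checks (regular file, owned by the given uid) are assumptions for
illustration, not the checks vmd(8) performs.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open `path` read-only and verify, on the open descriptor, that it
 * is a regular file owned by `uid`.
 * Returns the fd on success, -1 on failure.
 */
int
open_checked(const char *path, uid_t uid)
{
	struct stat st;
	int fd;

	if ((fd = open(path, O_RDONLY | O_NONBLOCK)) == -1)
		return (-1);
	/* fstat(2) describes the file we actually hold open. */
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode) ||
	    st.st_uid != uid) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```

A stat(2) on the path followed by open(2) would leave a window in which
the path could be replaced; checking the descriptor closes that window.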

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.
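
The rule can be sketched as a small C check. The function name is
hypothetical and this is not the vmd(8) implementation; rejecting an
empty name is an additional assumption.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Return 1 if `name` consists only of alphanumerics, '.', '_' and '-'
 * and does not start with one of those three punctuation characters;
 * return 0 otherwise.
 */
int
vm_name_valid(const char *name)
{
	size_t i;

	if (name == NULL || name[0] == '\0')
		return (0);	/* assumption: empty names are invalid */
	if (name[0] == '.' || name[0] == '_' || name[0] == '-')
		return (0);
	for (i = 0; name[i] != '\0'; i++) {
		/* ctype functions need an unsigned char value */
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return (0);
	}
	return (1);
}
```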

ok mlarkin@ pd@


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows removing the vm from the
daemon's list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it
after the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.
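
A rough sketch of the technique, under stated assumptions: the function
name and the quit flag are hypothetical, and a signal handler elsewhere
is assumed to set the flag. poll(2) returns EINTR when a signal arrives
instead of being restarted, so the loop gets a chance to notice the
flag before it calls accept(2).

```c
#include <sys/socket.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <stddef.h>

/*
 * Wait for a connection on listening socket `lfd`.
 * Returns the accepted fd, or -1 on error or once *quit is set
 * (assumed to be set by a signal handler).
 */
int
accept_interruptible(int lfd, const volatile sig_atomic_t *quit)
{
	struct pollfd pfd;
	int n;

	for (;;) {
		if (*quit)
			return (-1);
		pfd.fd = lfd;
		pfd.events = POLLIN;
		pfd.revents = 0;
		n = poll(&pfd, 1, -1);
		if (n == -1) {
			if (errno == EINTR)
				continue;	/* signal: re-check *quit */
			return (-1);
		}
		if (pfd.revents & (POLLERR | POLLHUP | POLLNVAL))
			return (-1);
		if (pfd.revents & POLLIN)
			return (accept(lfd, NULL, NULL));
	}
}
```

A signal that arrives between the flag check and the poll(2) call is
only noticed at the next wakeup; this sketch does not address that
window.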

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.154 04-Feb-2024 dv

Prevent null pointer deref if vm isn't found.

This area of code in vmd(8) is suspect, but the null dereference
is easily avoided.

Found by smatch, reported by and ok jsg@


# 1.153 18-Jan-2024 claudio

Use imsg_get_fd() in vmd.

vmd uses a lot of fd passing and does it sometimes via extra abstraction
so this just tries to convert the code without any optimisations.

ok dv@


Revision tags: OPENBSD_7_4_BASE
# 1.152 26-Sep-2023 dv

vmd(8): disambiguate log messages per vm and device.

The logging output from vmd(8) often specifies the function performing
the logging, but leaves which vm or vm device to guesswork and
reading tea leaves.

Change the logging formatting to prefix with information about the
specific vm and potentially the device subprocess. Most of this
logging is behind the "verbose" mode, but for warnings this will
clarify which vm or device logged the warning.

The format of vm/<name>/<device><index> is chosen to be concise and
less ugly than other approaches. This adjusts the process naming
for devices to match, dropping the use of brackets.

In the process of this change, updating log settings dynamically
via vmctl(8) is fixed by properly broadcasting that information to
the device subprocesses. The "vmm" process also now updates its own
state properly, so settings survive vm reboots.

ok mlarkin@


# 1.151 03-Jul-2023 jasper

when shutting down a vm, handle the VM id in the same way as a VM name argument

ok dv@


# 1.150 18-Jun-2023 op

relax absolute path requirement for configtest (-n)

ok dv@


# 1.149 13-May-2023 dv

vmm(4)/vmd(8): switch to anonymous shared mappings.

While splitting out emulated virtio network and block devices into
separate processes, I originally used named mappings via shm_mkstemp(3).
While this functionally achieved the desired result, it had two
unintended consequences:

1) tearing down a vm process and its child processes required
excessive locking as the guest memory was tied into the VFS layer.

2) it was observed by mlarkin@ that actions in other parts of the
VFS layer could cause some of the guest memory to flush to storage,
possibly filling /tmp.

This commit adds a new vmm(4) ioctl dedicated to allowing a process
to request that the kernel share a mapping of guest memory into its own vm
space. This requires an open fd to /dev/vmm (requiring root) and
both the "vmm" and "proc" pledge(2) promises. In addition, the caller
must know enough about the original memory ranges to reconstruct them
to make the vm's ranges.

Tested with help from Mischa Peters.

ok mlarkin@


# 1.148 12-May-2023 dv

vmd(8): fix segfault on vm creation.

vm_instance was using the wrong vm instance for checking the
vm_kernel_path member. Switch to using the value from the parent
vm instance in the check for if a kernel is known.

Issue reported by kn@. OK mlarkin@, kn@.


# 1.147 12-May-2023 dv

vmd(8): fix console attach from vmctl(8).

Adding in the ability to override the boot kernel created an edge
case in the ipc message handling logic for the parent process (vmd)
when receiving a "start vm" request. Result was incorrectly responding
to the control process, and as a result the vmctl client, with a
bogus "start vm response" reply with an empty tty name.

This commit rewrites the logic of how vmd goes about processing the
"start vm" request with the aim of making it simpler to understand
while addressing the edge case.

Issue reported by kn@. OK mlarkin@.


# 1.146 28-Apr-2023 dv

vmd(8)/vmctl(8): allow vm owners to override boot kernel.

vmd allows non-root users to "own" a vm defined in vm.conf(5). While
the user can start/stop the vm, if they break their filesystem they
have no means of booting recovery media like a ramdisk kernel.

This change opens the provided boot kernel via vmctl and passes the
file descriptor through the control channel to vmd. The next boot
of the vm will use the provided file descriptor as boot kernel/bios.
Subsequent boots (e.g. a reboot) will return to using behavior
defined in vm.conf or the default bios image.

ok mlarkin@


# 1.145 27-Apr-2023 dv

vmd(8): introduce multi-process model for virtio devices.

Isolate virtio network and block device emulation in dedicated
processes, forked and exec'd from the vm process. This allows for
tightening pledge promises to just "stdio".

Communication between the vcpu's and these devices now occurs via
imsg channels, which adds the benefit of not always blocking the
vcpu thread while emulating the device.

With this commit, it's possible that vmd is the first open source
hypervisor that *defaults* to a multi-process device emulation
model without requiring any additional configuration from the
operator.

Testing help from phessler@ and Mischa Peters.

ok mlarkin@


# 1.144 25-Apr-2023 dv

vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.

The object sent to vmm(4) contained file paths and details the
kernel does not need for cpu virtualization as device emulation is
in userland. Effectively, "pull up" the struct members from the
vm_create_params struct to the parent vmop_create_params struct.

This allows us to clean up some of vmd(8) and simplify things for
switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd,
etc.) to allow users to boot recovery ramdisk kernels.

ok mlarkin@


# 1.143 24-Apr-2023 kn

Missing the optional default config is not an error

/var/log/{messages,daemon} logs ENOENT as error on default configless vmd.
Only complain on explicitly passed files and print a debug hint under `-vv'
in case someone forgot to populate their /etc/vm.conf.

OK dv mlarkin


# 1.142 23-Apr-2023 dv

vmd(8): teach vmm process how to exec.

Use execvp(2) to launch vm children with new address spaces.
Consequently, introduces use of unveil(2) into the vmm and vm
processes.

This imposes the requirement of launching vmd with absolute paths,
similar to sshd(8).

ok mlarkin@


# 1.141 19-Apr-2023 jsg

remove duplicate includes


# 1.140 16-Apr-2023 dv

vmd(8): clean up fd closing in vmm process.

Some mild tidying of fd closing in the vmm process in prep for
landing parts of my fork+exec diff.

With input from guenther@ on the nuances of if/when EINTR may happen
in a call to close(2).

ok mlarkin@


# 1.139 02-Apr-2023 dv

vmd(8): migrate vmd_vm.vm_ttyname to char array.

Other structs use a fixed length array already. This allows a vmd_vm
object to be transmitted over an ipc channel, too.

Additionally, solves a segfault caused by a strlcpy(3) in an error
path.

ok mlarkin@


Revision tags: OPENBSD_7_3_BASE
# 1.138 28-Jan-2023 dv

Move some header definitions from vmm(4) to vmd(8).

Part of an ongoing effort to move userland-specific information out
of a kernel header and directly into vmd(8). No functional change.

ok mlarkin@


# 1.137 22-Jan-2023 dv

vmd(8): don't remove known vm's from the config on error.

Multiple error paths, specifically the one related to if a guest
cannot allocate memory at start, resulted in a known vm (via
vm.conf(5)) being removed from the vm list. Adjust the error paths
to check if the failing vm is defined in the config before tearing
it down.

Tested with help from beck@ and Mischa Peters.

ok beck@


# 1.136 14-Jan-2023 dv

Only open /dev/vmm once in vmd(8).

Have the parent process open /dev/vmm and send the fd to the vmm
child process. Only the vmm process and its resulting children
(guest vms) need it for ioctl calls.

ok kn@


# 1.135 28-Dec-2022 jmc

spelling fixes; from paul tagliamonte
any parts of his diff not taken are noted on tech


# 1.134 15-Dec-2022 dv

Add explicit casts to ctype functions in vmd(8).

OK millert@


# 1.133 31-Oct-2022 dv

vmd(8): remove unfinished user accounting.

User accounting and enforcement was never finished. tedu the thing
until someone wants to pick it up and finish it.

Originally found by Matthew Martin.

ok mlarkin@, kn@. input from tb@.


Revision tags: OPENBSD_7_2_BASE
# 1.132 13-Sep-2022 martijn

branches: 1.132.2;
Add (partial) support for agentx in vmd.

Metrics can be found under mib-2.236 and VM-MIB (RFC7666).

Stress tested by and happy noises from Mischa Peters
OK dv@


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

branches: 1.130.2;
vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses the 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.
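
One way to picture the batching described above, as a sketch only: if VMs are numbered in configuration order, the start delay of each VM follows from its batch number. The function name and the index-based scheme are assumptions; vmd's actual scheduling code is not shown in this log.

```c
#include <assert.h>

/*
 * Hypothetical: delay in seconds before starting the VM at position
 * `index` (0-based), when `parallel` VMs start per batch and batches
 * are `delay_sec` seconds apart.
 */
unsigned int
example_start_delay(unsigned int index, unsigned int parallel,
    unsigned int delay_sec)
{
	assert(parallel > 0);
	return (index / parallel) * delay_sec;
}
```

With a parallelism of 4 and the default 30 second delay, VMs 0 to 3 would start immediately and VMs 4 to 7 after 30 seconds.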

ok mlarkin@ (also addressed comments from cheloha@)


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
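
A minimal sketch of the stricter convention, using open(2) as the example call. `example_try_open` is a hypothetical helper and not code from vmd.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open a path read-only, return 0 or an errno value. */
int
example_try_open(const char *path)
{
	int fd;

	assert(path != NULL);
	/* Compare against -1 exactly; errno is only meaningful in that case. */
	if ((fd = open(path, O_RDONLY)) == -1)
		return errno;
	close(fd);
	return 0;
}
```

The older style, `if (open(...) < 0)`, behaves the same for well-behaved system calls; the exact comparison simply matches what the system call documentation promises.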


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states
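
As an illustration of the single-bitfield approach, the sketch below uses hypothetical EX_STATE_* flags and a hypothetical struct; the real flag names and values in vmd may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state bits, for illustration only. */
#define EX_STATE_RUNNING	0x01u
#define EX_STATE_PAUSED		0x02u
#define EX_STATE_DISABLED	0x04u

struct example_vm {
	unsigned int	state;	/* OR of EX_STATE_* bits */
};

void
example_pause(struct example_vm *vm)
{
	assert(vm != NULL);
	vm->state |= EX_STATE_PAUSED;
}

void
example_unpause(struct example_vm *vm)
{
	assert(vm != NULL);
	vm->state &= ~EX_STATE_PAUSED;
}

/* Map the bitfield to a single status word for reporting. */
const char *
example_state_name(const struct example_vm *vm)
{
	assert(vm != NULL);
	if (vm->state & EX_STATE_DISABLED)
		return "disabled";
	if (vm->state & EX_STATE_PAUSED)
		return "paused";
	if (vm->state & EX_STATE_RUNNING)
		return "running";
	return "stopped";
}
```

Keeping all bits in one field means a paused VM is still marked running underneath, so unpausing only has to clear one bit.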

no functional change intended

ok ccardenas@ mlarkin@


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.
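
The rule above can be sketched as follows. The two #defines use the names and values stated in this message; the struct, the function, and the reset of the counter after a slow reboot are assumptions for illustration, not the committed code.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define VM_START_RATE_SEC	6	/* value as stated in the log */
#define VM_START_RATE_LIMIT	3	/* value as stated in the log */

/* Hypothetical per-VM bookkeeping. */
struct example_vm {
	time_t	last_start;	/* time of the previous start */
	int	fast_reboots;	/* consecutive fast reboots seen */
};

/*
 * Record a restart at time `now`. Returns 1 if the VM may restart,
 * 0 if it hit the limit and should be stopped.
 */
int
example_may_restart(struct example_vm *vm, time_t now)
{
	assert(vm != NULL);
	if (now - vm->last_start < VM_START_RATE_SEC)
		vm->fast_reboots++;
	else
		vm->fast_reboots = 0;	/* assumption: slow reboot resets */
	vm->last_start = now;
	return vm->fast_reboots < VM_START_RATE_LIMIT;
}
```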

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.
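
The lookup order described above can be sketched with a toy in-memory model. This is not the qcow2 on-disk layout and not vmd's code; `struct example_image` and `example_lookup` are hypothetical and store one byte per cluster for brevity.

```c
#include <assert.h>
#include <stddef.h>

#define EX_NCLUSTERS	4

/* Hypothetical in-memory stand-in for a disk image. */
struct example_image {
	const struct example_image *base;	/* NULL for a base image */
	unsigned char present[EX_NCLUSTERS];	/* 1 if cluster is held here */
	unsigned char data[EX_NCLUSTERS];	/* one byte per cluster */
};

/*
 * Look up a cluster, starting in the derived image and walking toward
 * the base. Returns 0 and stores the byte, or -1 if no image in the
 * chain holds the cluster.
 */
int
example_lookup(const struct example_image *img, size_t cluster,
    unsigned char *out)
{
	assert(out != NULL);
	if (cluster >= EX_NCLUSTERS)
		return -1;
	for (; img != NULL; img = img->base) {
		if (img->present[cluster]) {
			*out = img->data[cluster];
			return 0;
		}
	}
	return -1;
}
```

The model also shows why changing the base image is unsafe: any cluster the derived image does not hold is read straight from the base.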

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
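
The failure mode is easy to reproduce in miniature: a zeroed struct holds fd 0 in every slot, so a cleanup loop closes stdin. A sketch of the defensive pattern, with a hypothetical `struct example_fds` standing in for vmd's real structures:

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#define EX_NFDS	4

/* Hypothetical container for a VM's descriptors. */
struct example_fds {
	int	fd[EX_NFDS];
};

/* Mark every slot unused; 0 would be a valid descriptor (stdin). */
void
example_fds_init(struct example_fds *f)
{
	size_t i;

	assert(f != NULL);
	for (i = 0; i < EX_NFDS; i++)
		f->fd[i] = -1;
}

/* Close only slots that hold a descriptor. Returns how many were closed. */
int
example_fds_close(struct example_fds *f)
{
	size_t i;
	int closed = 0;

	assert(f != NULL);
	for (i = 0; i < EX_NFDS; i++) {
		if (f->fd[i] == -1)
			continue;	/* never call close(-1) */
		close(f->fd[i]);
		f->fd[i] = -1;
		closed++;
	}
	return closed;
}
```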


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.
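
A sketch of total-usage accounting under the limits quoted above. The EX_MAX_* names, the struct, and the function are hypothetical; only the numbers (4 CPUs, 2G of memory, 8 interfaces) come from this message.

```c
#include <assert.h>
#include <stddef.h>

/* Limits as quoted in the log; the names are hypothetical. */
#define EX_MAX_CPUS	4u
#define EX_MAX_MEM_MB	2048u
#define EX_MAX_IFS	8u

/* Running totals for one user across all of their VMs. */
struct example_usage {
	unsigned int	cpus;
	unsigned int	mem_mb;
	unsigned int	ifs;
};

/*
 * Try to account for a new VM. Returns 0 and updates the totals, or -1
 * (leaving the totals untouched) if any limit would be exceeded.
 */
int
example_claim(struct example_usage *u, unsigned int cpus,
    unsigned int mem_mb, unsigned int ifs)
{
	assert(u != NULL);
	if (cpus > EX_MAX_CPUS - u->cpus ||
	    mem_mb > EX_MAX_MEM_MB - u->mem_mb ||
	    ifs > EX_MAX_IFS - u->ifs)
		return -1;
	u->cpus += cpus;
	u->mem_mb += mem_mb;
	u->ifs += ifs;
	return 0;
}
```

Comparing against the remaining headroom, rather than adding first, keeps the check safe from unsigned overflow.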

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
  called the function in the async context. It replaces the manual
  log_debug in front of each vm_stop/vm_remove. This debug logging
  trick can be removed in the future once we are more confident about
  it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
  CentOS 6 (secondary), Ubuntu 17.10 (secondary).
  NOTE: Secondary indicates some issue(s) preventing full/reliable
  functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
  default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-'. but does not start with the latter
three.
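
Read literally, the rule above amounts to the check sketched below. `example_valid_vmname` is a hypothetical function, not the committed one, and rejecting empty names is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical validator: letters and digits anywhere; '.', '_' and
 * '-' anywhere except the first character. Returns 1 if valid, else 0.
 */
int
example_valid_vmname(const char *name)
{
	size_t i;
	unsigned char c;

	if (name == NULL || name[0] == '\0')
		return 0;	/* assumption: empty names are rejected */
	if (!isalnum((unsigned char)name[0]))
		return 0;
	for (i = 1; name[i] != '\0'; i++) {
		c = (unsigned char)name[i];
		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return 0;
	}
	return 1;
}

/* Usage, written as checks. */
void
example_vmname_checks(void)
{
	assert(example_valid_vmname("openbsd.vm"));
	assert(!example_valid_vmname("-oops"));
}
```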

ok mlarkin@ pd@


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows removing the vm from the
daemon's list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root, but pledged as
  "stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
  and the primary communication with the vmm(4) subsystem. It runs as _vmd
  in the chroot but does not use pledge, as the vmm ioctls are not allowed
  by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.153 18-Jan-2024 claudio

Use imsg_get_fd() in vmd.

vmd uses a lot of fd passing and does it sometimes via extra abstraction
so this just tries to convert the code without any optimisations.

ok dv@


Revision tags: OPENBSD_7_4_BASE
# 1.152 26-Sep-2023 dv

vmd(8): disambiguate log messages per vm and device.

The logging output from vmd(8) often specifies the function performing
the logging, but leaves which vm or vm device to guesswork and
reading tea leaves.

Change the logging formatting to prefix with information about the
specific vm and potentially the device subprocess. Most of this
logging is behind the "verbose" mode, but for warnings this will
clarify which vm or device logged the warning.

The format of vm/<name>/<device><index> is chosen to be concise and
less ugly than other approaches. This adjusts the process naming
for devices to match, dropping the use of brackets.

In the process of this change, updating log settings dynamically
via vmctl(8) is fixed by properly broadcasting that information to
the device subprocesses. The "vmm" process also now updates its own
state properly, so settings survive vm reboots.

ok mlarkin@


# 1.151 03-Jul-2023 jasper

when shutting down a vm, handle the VM id in the same way as a VM name argument

ok dv@


# 1.150 18-Jun-2023 op

relax absolute path requirement for configtest (-n)

ok dv@


# 1.149 13-May-2023 dv

vmm(4)/vmd(8): switch to anonymous shared mappings.

While splitting out emulated virtio network and block devices into
separate processes, I originally used named mappings via shm_mkstemp(3).
While this functionally achieved the desired result, it had two
unintended consequences:

1) tearing down a vm process and its child processes required
excessive locking as the guest memory was tied into the VFS layer.

2) it was observed by mlarkin@ that actions in other parts of the
VFS layer could cause some of the guest memory to flush to storage,
possibly filling /tmp.

This commit adds a new vmm(4) ioctl dedicated to allowing a process to
request that the kernel share a mapping of guest memory into its own vm
space. This requires an open fd to /dev/vmm (requiring root) and
both the "vmm" and "proc" pledge(2) promises. In addition, the caller
must know enough about the original memory ranges to reconstruct them
to make the vm's ranges.

Tested with help from Mischa Peters.

ok mlarkin@


# 1.148 12-May-2023 dv

vmd(8): fix segfault on vm creation.

vm_instance was using the wrong vm instance for checking the
vm_kernel_path member. Switch to using the value from the parent
vm instance in the check for if a kernel is known.

Issue reported by kn@. OK mlarkin@, kn@.


# 1.147 12-May-2023 dv

vmd(8): fix console attach from vmctl(8).

Adding in the ability to override the boot kernel created an edge
case in the ipc message handling logic for the parent process (vmd)
when receiving a "start vm" request. Result was incorrectly responding
to the control process, and as a result the vmctl client, with a
bogus "start vm response" reply with an empty tty name.

This commit rewrites the logic of how vmd goes about processing the
"start vm" request with the aim of making it simpler to understand
while addressing the edge case.

Issue reported by kn@. OK mlarkin@.


# 1.146 28-Apr-2023 dv

vmd(8)/vmctl(8): allow vm owners to override boot kernel.

vmd allows non-root users to "own" a vm defined in vm.conf(5). While
the user can start/stop the vm, if they break their filesystem they
have no means of booting recovery media like a ramdisk kernel.

This change opens the provided boot kernel via vmctl and passes the
file descriptor through the control channel to vmd. The next boot
of the vm will use the provided file descriptor as boot kernel/bios.
Subsequent boots (e.g. a reboot) will return to using behavior
defined in vm.conf or the default bios image.

ok mlarkin@


# 1.145 27-Apr-2023 dv

vmd(8): introduce multi-process model for virtio devices.

Isolate virtio network and block device emulation in dedicated
processes, forked and exec'd from the vm process. This allows for
tightening pledge promises to just "stdio".

Communication between the vcpu's and these devices now occurs via
imsg channels, which adds the benefit of not always blocking the
vcpu thread while emulating the device.

With this commit, it's possible that vmd is the first open source
hypervisor that *defaults* to a multi-process device emulation
model without requiring any additional configuration from the
operator.

Testing help from phessler@ and Mischa Peters.

ok mlarkin@


# 1.144 25-Apr-2023 dv

vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.

The object sent to vmm(4) contained file paths and details the
kernel does not need for cpu virtualization as device emulation is
in userland. Effectively, "pull up" the struct members from the
vm_create_params struct to the parent vmop_create_params struct.

This allows us to clean up some of vmd(8) and simplify things for
switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd,
etc.) to allow users to boot recovery ramdisk kernels.

ok mlarkin@


# 1.143 24-Apr-2023 kn

Missing the optional default config is not an error

/var/log/{messages,daemon} logs ENOENT as error on default configless vmd.
Only complain on explicitly passed files and print a debug hint under `-vv'
in case someone forgot to populate their /etc/vm.conf.

OK dv mlarkin


# 1.142 23-Apr-2023 dv

vmd(8): teach vmm process how to exec.

Use execvp(2) to launch vm children with new address spaces.
Consequently, introduces use of unveil(2) into the vmm and vm
processes.

This imposes the requirement of launching vmd with absolute paths,
similar to sshd(8).

ok mlarkin@


# 1.141 19-Apr-2023 jsg

remove duplicate includes


# 1.140 16-Apr-2023 dv

vmd(8): clean up fd closing in vmm process.

Some mild tidying of fd closing in the vmm process in prep for
landing parts of my fork+exec diff.

With input from guenther@ on the nuances of if/when EINTR may happen
in a call to close(2).

ok mlarkin@


# 1.139 02-Apr-2023 dv

vmd(8): migrate vmd_vm.vm_ttyname to char array.

Other structs use a fixed length array already. This allows a vmd_vm
object to be transmitted over an ipc channel, too.

Additionally, solves a segfault caused by a strlcpy(3) in an error
path.

ok mlarkin@


Revision tags: OPENBSD_7_3_BASE
# 1.138 28-Jan-2023 dv

Move some header definitions from vmm(4) to vmd(8).

Part of an ongoing effort to move userland-specific information out
of a kernel header and directly into vmd(8). No functional change.

ok mlarkin@


# 1.137 22-Jan-2023 dv

vmd(8): don't remove known vm's from the config on error.

Multiple error paths, specifically the one related to if a guest
cannot allocate memory at start, resulted in a known vm (via
vm.conf(5)) being removed from the vm list. Adjust the error paths
to check if the failing vm is defined in the config before tearing
it down.

Tested with help from beck@ and Mischa Peters.

ok beck@


# 1.136 14-Jan-2023 dv

Only open /dev/vmm once in vmd(8).

Have the parent process open /dev/vmm and send the fd to the vmm
child process. Only the vmm process and its resulting children
(guest vms) need it for ioctl calls.

ok kn@


# 1.135 28-Dec-2022 jmc

spelling fixes; from paul tagliamonte
any parts of his diff not taken are noted on tech


# 1.134 15-Dec-2022 dv

Add explicit casts to ctype functions in vmd(8).

OK millert@


# 1.133 31-Oct-2022 dv

vmd(8): remove unfinished user accounting.

User accounting and enforcement was never finished. tedu the thing
until someone wants to pick it up and finish it.

Originally found by Matthew Martin.

ok mlarkin@, kn@. input from tb@.


Revision tags: OPENBSD_7_2_BASE
# 1.132 13-Sep-2022 martijn

branches: 1.132.2;
Add (partial) support for agentx in vmd.

Metrics can be found under mib-2.236 and VM-MIB (RFC7666).

Stress tested by and happy noises from Mischa Peters
OK dv@


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

branches: 1.130.2;
vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin
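
A local nitems() of the kind mentioned here is usually a one-line macro.
The sketch below assumes the customary definition; ex_table and
ex_table_sum are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Local stand-in for the nitems() macro otherwise taken from sys/param.h. */
#ifndef nitems
#define nitems(_a)	(sizeof((_a)) / sizeof((_a)[0]))
#endif

/* Hypothetical table, only here to give nitems() something to count. */
static const int ex_table[] = { 1, 2, 3, 4 };

/* Sum the table without hard-coding its length. */
static int
ex_table_sum(void)
{
	size_t i;
	int sum = 0;

	for (i = 0; i < nitems(ex_table); i++)
		sum += ex_table[i];
	return sum;
}
```

The macro only works on true arrays; applied to a pointer it silently
yields the wrong count.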


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666,but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of
30 seconds between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
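
The convention described above might look like the following sketch.
ex_open_ro is a hypothetical helper, not a function from vmd; it only
shows the "compare against -1, then read errno" pattern.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open a file read-only. Returns the fd on success, or -1 with
 * *errp set to the errno value reported by open(2).
 */
static int
ex_open_ro(const char *path, int *errp)
{
	int fd;

	/* Test for exactly -1; errno is only meaningful in that case. */
	if ((fd = open(path, O_RDONLY)) == -1) {
		*errp = errno;
		return -1;
	}
	*errp = 0;
	return fd;
}
```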


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
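
A minimal sketch of the set-then-check idea, assuming a fixed-size
signature field at the start of the header. EX_SIGNATURE and
struct ex_dump_header are placeholders; the real value of
VMM_HV_SIGNATURE and the layout of vm_dump_header are not reproduced here.

```c
#include <string.h>

#define EX_SIGNATURE	"EXAMPLE-SIG"	/* placeholder, not VMM_HV_SIGNATURE */

/* Hypothetical header layout; real headers carry more fields. */
struct ex_dump_header {
	char	signature[16];
};

/* Writer side: zero the header and stamp the signature. */
static void
ex_header_init(struct ex_dump_header *h)
{
	memset(h, 0, sizeof(*h));
	memcpy(h->signature, EX_SIGNATURE, sizeof(EX_SIGNATURE));
}

/* Reader side: refuse images whose signature does not match. */
static int
ex_header_valid(const struct ex_dump_header *h)
{
	return memcmp(h->signature, EX_SIGNATURE, sizeof(EX_SIGNATURE)) == 0;
}
```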

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states
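
As an illustration of the single-bitfield approach, the sketch below
uses made-up flag names (EX_STATE_*) and a made-up struct; the actual
flag names and values in vmd may differ.

```c
/* Hypothetical state flags; each state occupies one bit. */
#define EX_STATE_RUNNING	0x01
#define EX_STATE_PAUSED		0x02
#define EX_STATE_DISABLED	0x04

struct ex_vm {
	unsigned int	state;	/* OR of EX_STATE_* flags */
};

/* Set the paused bit without disturbing the other states. */
static void
ex_vm_pause(struct ex_vm *vm)
{
	vm->state |= EX_STATE_PAUSED;
}

/* Clear the paused bit again. */
static void
ex_vm_unpause(struct ex_vm *vm)
{
	vm->state &= ~EX_STATE_PAUSED;
}

/* Return 1 if the given flag is set, 0 otherwise. */
static int
ex_vm_is(const struct ex_vm *vm, unsigned int flag)
{
	return (vm->state & flag) != 0;
}
```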

no functional change intended

ok ccardenas@ mlarkin@


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.
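
The rate limit described above could be sketched roughly as follows.
The two constants mirror the values quoted in this message; the struct,
the function name, and the choice to reset the counter after a slow
reboot are assumptions made for the example, not a copy of the vmd code.

```c
#include <time.h>

#define EX_START_RATE_SEC	6	/* value quoted for VM_START_RATE_SEC */
#define EX_START_RATE_LIMIT	3	/* value quoted for VM_START_RATE_LIMIT */

/* Hypothetical per-vm bookkeeping. */
struct ex_vm_rate {
	time_t	last_start;	/* 0 means "never started" */
	int	fast_count;	/* consecutive fast reboots seen */
};

/*
 * Record a (re)start at time "now". Returns 1 if the vm may start,
 * 0 if it has rebooted too quickly too often and should be stopped.
 */
static int
ex_vm_start_allowed(struct ex_vm_rate *r, time_t now)
{
	if (r->last_start != 0 && now - r->last_start < EX_START_RATE_SEC)
		r->fast_count++;
	else
		r->fast_count = 0;	/* assumption: slow reboot resets */
	r->last_start = now;
	return r->fast_count < EX_START_RATE_LIMIT;
}
```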

OK mlarkin@


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
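
The problem and the fix can be pictured with a small sketch: a
zero-filled struct holds fd 0 in every slot, so a blanket close loop
would close stdin. The struct and function names here are hypothetical.

```c
#include <unistd.h>

#define EX_NFDS	4

/* Hypothetical per-vm descriptor table. */
struct ex_vm_fds {
	int	fd[EX_NFDS];
};

/* Mark every slot as unused; 0 is a valid fd, so -1 is the sentinel. */
static void
ex_fds_init(struct ex_vm_fds *f)
{
	int i;

	for (i = 0; i < EX_NFDS; i++)
		f->fd[i] = -1;
}

/* Close only slots that hold a descriptor; returns how many were closed. */
static int
ex_fds_close(struct ex_vm_fds *f)
{
	int i, n = 0;

	for (i = 0; i < EX_NFDS; i++) {
		if (f->fd[i] == -1)
			continue;	/* never pass -1 to close(2) */
		close(f->fd[i]);
		f->fd[i] = -1;
		n++;
	}
	return n;
}
```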


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.
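
A minimal sketch of the open-then-check order, assuming the check is
done with fstat(2) on the descriptor rather than stat(2) on the path.
ex_open_checked and its exact policy (regular file, owned by the given
uid) are illustrative assumptions, not the checks vmd performs.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open path and validate the object that was actually opened.
 * Returns the fd, or -1 if the open or the checks fail.
 */
static int
ex_open_checked(const char *path, uid_t uid)
{
	struct stat st;
	int fd;

	if ((fd = open(path, O_RDONLY | O_NONBLOCK)) == -1)
		return -1;

	/*
	 * fstat(2) looks at the open file itself, so swapping the
	 * path for something else after the check has no effect.
	 */
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode) ||
	    st.st_uid != uid) {
		close(fd);
		return -1;
	}
	return fd;
}
```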

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (don't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-'. but does not start with the latter
three.
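
The naming rule above can be sketched as a small validator.
ex_vm_name_valid is a hypothetical name, and rejecting the empty string
is an assumption; the casts to unsigned char follow the usual rule for
passing char values to the ctype functions.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Return 1 if name consists of alphanumerics, '.', '_' and '-',
 * and starts with an alphanumeric character; 0 otherwise.
 */
static int
ex_vm_name_valid(const char *name)
{
	size_t i;

	if (name == NULL || name[0] == '\0')
		return 0;	/* assumption: empty names are invalid */

	/* The first character may not be '.', '_' or '-'. */
	if (!isalnum((unsigned char)name[0]))
		return 0;

	for (i = 1; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return 0;
	}
	return 1;
}
```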

ok mlarkin@ pd@


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.
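
For reference, 100.64.0.0/10 spans 100.64.0.0 through 100.127.255.255.
The sketch below checks membership with plain mask arithmetic on
host-order IPv4 addresses; the EX_* names and helper functions are
illustrative and not part of vmd.

```c
#include <stdint.h>

#define EX_PREFIX	0x64400000u	/* 100.64.0.0 in host byte order */
#define EX_PREFIXLEN	10

/* Build a netmask from a prefix length (0..32). */
static uint32_t
ex_mask(int len)
{
	if (len == 0)
		return 0;
	return (uint32_t)(0xffffffffu << (32 - len));
}

/* Return 1 if addr (host byte order) falls inside the default prefix. */
static int
ex_in_prefix(uint32_t addr)
{
	return (addr & ex_mask(EX_PREFIXLEN)) == EX_PREFIX;
}
```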

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root, but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and after the
last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking for pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many
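
Illustration only, not the committed code: a minimal sketch of the
poll-before-accept idea described above. The helper name
accept_interruptible() and the quit_requested flag are hypothetical, and
plain accept(2) stands in for accept4() to keep the sketch portable. The
point is that poll(2) returns EINTR after a signal instead of being
restarted, so the caller gets a chance to leave its loop.

```c
#include <poll.h>
#include <signal.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical flag; a signal handler elsewhere would set it to 1. */
static volatile sig_atomic_t quit_requested;

/*
 * Wait until listen_fd is readable, then accept one connection.
 * Returns the new fd, or -1 if a signal interrupted the wait, a quit
 * was requested, or accept(2) failed.
 */
static int
accept_interruptible(int listen_fd)
{
	struct pollfd pfd = { .fd = listen_fd, .events = POLLIN };

	/* poll(2) is not restarted after a signal; it fails with EINTR. */
	if (poll(&pfd, 1, -1) == -1)
		return (-1);
	if (quit_requested || !(pfd.revents & POLLIN))
		return (-1);
	return (accept(listen_fd, NULL, NULL));
}
```

A control-socket loop would call this helper repeatedly and break out
once it returns -1 with the quit flag set.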


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.152 26-Sep-2023 dv

vmd(8): disambiguate log messages per vm and device.

The logging output from vmd(8) often specifies the function performing
the logging, but leaves which vm or vm device to guesswork and
reading tea leaves.

Change the logging formatting to prefix with information about the
specific vm and potentially the device subprocess. Most of this
logging is behind the "verbose" mode, but for warnings this will
clarify which vm or device logged the warning.

The format of vm/<name>/<device><index> is chosen to be concise and
less ugly than other approaches. This adjusts the process naming
for devices to match, dropping the use of brackets.

In the process of this change, updating log settings dynamically
via vmctl(8) is fixed by properly broadcasting that information to
the device subprocesses. The "vmm" process also now updates its own
state properly, so settings survive vm reboots.

ok mlarkin@


# 1.151 03-Jul-2023 jasper

when shutting down a vm, handle the VM id in the same way as a VM name argument

ok dv@


# 1.150 18-Jun-2023 op

relax absolute path requirement for configtest (-n)

ok dv@


# 1.149 13-May-2023 dv

vmm(4)/vmd(8): switch to anonymous shared mappings.

While splitting out emulated virtio network and block devices into
separate processes, I originally used named mappings via shm_mkstemp(3).
While this functionally achieved the desired result, it had two
unintended consequences:

1) tearing down a vm process and its child processes required
excessive locking as the guest memory was tied into the VFS layer.

2) it was observed by mlarkin@ that actions in other parts of the
VFS layer could cause some of the guest memory to flush to storage,
possibly filling /tmp.

This commit adds a new vmm(4) ioctl dedicated to allowing a process to
request that the kernel share a mapping of guest memory into its own vm
space. This requires an open fd to /dev/vmm (requiring root) and
both the "vmm" and "proc" pledge(2) promises. In addition, the caller
must know enough about the original memory ranges to reconstruct them
to make the vm's ranges.

Tested with help from Mischa Peters.

ok mlarkin@


# 1.148 12-May-2023 dv

vmd(8): fix segfault on vm creation.

vm_instance was using the wrong vm instance for checking the
vm_kernel_path member. Switch to using the value from the parent
vm instance in the check for if a kernel is known.

Issue reported by kn@. OK mlarkin@, kn@.


# 1.147 12-May-2023 dv

vmd(8): fix console attach from vmctl(8).

Adding in the ability to override the boot kernel created an edge
case in the ipc message handling logic for the parent process (vmd)
when receiving a "start vm" request. Result was incorrectly responding
to the control process, and as a result the vmctl client, with a
bogus "start vm response" reply with an empty tty name.

This commit rewrites the logic of how vmd goes about processing the
"start vm" request with the aim of making it simpler to understand
while addressing the edge case.

Issue reported by kn@. OK mlarkin@.


# 1.146 28-Apr-2023 dv

vmd(8)/vmctl(8): allow vm owners to override boot kernel.

vmd allows non-root users to "own" a vm defined in vm.conf(5). While
the user can start/stop the vm, if they break their filesystem they
have no means of booting recovery media like a ramdisk kernel.

This change opens the provided boot kernel via vmctl and passes the
file descriptor through the control channel to vmd. The next boot
of the vm will use the provided file descriptor as boot kernel/bios.
Subsequent boots (e.g. a reboot) will return to using behavior
defined in vm.conf or the default bios image.

ok mlarkin@


# 1.145 27-Apr-2023 dv

vmd(8): introduce multi-process model for virtio devices.

Isolate virtio network and block device emulation in dedicated
processes, forked and exec'd from the vm process. This allows for
tightening pledge promises to just "stdio".

Communication between the vcpu's and these devices now occurs via
imsg channels, which adds the benefit of not always blocking the
vcpu thread while emulating the device.

With this commit, it's possible that vmd is the first open source
hypervisor that *defaults* to a multi-process device emulation
model without requiring any additional configuration from the
operator.

Testing help from phessler@ and Mischa Peters.

ok mlarkin@


# 1.144 25-Apr-2023 dv

vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.

The object sent to vmm(4) contained file paths and details the
kernel does not need for cpu virtualization as device emulation is
in userland. Effectively, "pull up" the struct members from the
vm_create_params struct to the parent vmop_create_params struct.

This allows us to clean up some of vmd(8) and simplify things for
switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd,
etc.) to allow users to boot recovery ramdisk kernels.

ok mlarkin@


# 1.143 24-Apr-2023 kn

Missing the optional default config is not an error

/var/log/{messages,daemon} logs ENOENT as error on default configless vmd.
Only complain on explicitly passed files and print a debug hint under `-vv'
in case someone forgot to populate their /etc/vm.conf.

OK dv mlarkin


# 1.142 23-Apr-2023 dv

vmd(8): teach vmm process how to exec.

Use execvp(2) to launch vm children with new address spaces.
Consequently, introduces use of unveil(2) into the vmm and vm
processes.

This imposes the requirement of launching vmd with absolute paths,
similar to sshd(8).

ok mlarkin@


# 1.141 19-Apr-2023 jsg

remove duplicate includes


# 1.140 16-Apr-2023 dv

vmd(8): clean up fd closing in vmm process.

Some mild tidying of fd closing in the vmm process in prep for
landing parts of my fork+exec diff.

With input from guenther@ on the nuances of if/when EINTR may happen
in a call to close(2).

ok mlarkin@


# 1.139 02-Apr-2023 dv

vmd(8): migrate vmd_vm.vm_ttyname to char array.

Other structs use a fixed length array already. This allows a vmd_vm
object to be transmitted over an ipc channel, too.

Additionally, solves a segfault caused by a strlcpy(3) in an error
path.

ok mlarkin@


Revision tags: OPENBSD_7_3_BASE
# 1.138 28-Jan-2023 dv

Move some header definitions from vmm(4) to vmd(8).

Part of an ongoing effort to move userland-specific information out
of a kernel header and directly into vmd(8). No functional change.

ok mlarkin@


# 1.137 22-Jan-2023 dv

vmd(8): don't remove known vm's from the config on error.

Multiple error paths, specifically the one related to if a guest
cannot allocate memory at start, resulted in a known vm (via
vm.conf(5)) being removed from the vm list. Adjust the error paths
to check if the failing vm is defined in the config before tearing
it down.

Tested with help from beck@ and Mischa Peters.

ok beck@


# 1.136 14-Jan-2023 dv

Only open /dev/vmm once in vmd(8).

Have the parent process open /dev/vmm and send the fd to the vmm
child process. Only the vmm process and its resulting children
(guest vms) need it for ioctl calls.

ok kn@


# 1.135 28-Dec-2022 jmc

spelling fixes; from paul tagliamonte
any parts of his diff not taken are noted on tech


# 1.134 15-Dec-2022 dv

Add explicit casts to ctype functions in vmd(8).

OK millert@


# 1.133 31-Oct-2022 dv

vmd(8): remove unfinished user accounting.

User accounting and enforcement was never finished. tedu the thing
until someone wants to pick it up and finish it.

Originally found by Matthew Martin.

ok mlarkin@, kn@. input from tb@.


Revision tags: OPENBSD_7_2_BASE
# 1.132 13-Sep-2022 martijn

branches: 1.132.2;
Add (partial) support for agentx in vmd.

Metrics can be found under mib-2.236 and VM-MIB (RFC7666).

Stress tested by and happy noises from Mischa Peters
OK dv@


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

branches: 1.130.2;
vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
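
Illustration only: a minimal sketch of the batching arithmetic implied
by "parallelism" and "delay between batches". The function name
start_delay() and its parameters are hypothetical; how vmd actually
schedules the starts is not shown in this log.

```c
/*
 * Return the number of seconds to wait before starting the VM at
 * position idx (0-based) in the configuration, when `parallel` VMs are
 * started per batch and batches are `delay` seconds apart.
 */
static unsigned int
start_delay(unsigned int idx, unsigned int parallel, unsigned int delay)
{
	if (parallel == 0)
		parallel = 1;	/* avoid dividing by zero */
	return ((idx / parallel) * delay);
}
```

With 4 online CPUs and the default 30 second delay, VMs 0 to 3 start
immediately, VMs 4 to 7 after 30 seconds, and so on.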


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
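
Illustration only: a minimal sketch of the convention described above,
using open(2). The wrapper name open_ro() is hypothetical. The failure
test compares against -1 exactly, and errno is read only in that case.

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Open path read-only. On failure return -1 and store errno in *err.
 * Note the check is "== -1", not "< 0".
 */
static int
open_ro(const char *path, int *err)
{
	int fd;

	if ((fd = open(path, O_RDONLY)) == -1) {
		*err = errno;
		return (-1);
	}
	return (fd);
}
```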


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
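
Illustration only: a minimal sketch of tracking several states in one
bitfield. The ST_* names, their values and state_name() are
hypothetical and are not taken from vmd's headers.

```c
/* Hypothetical state bits; one unsigned int replaces several flags. */
#define ST_RUNNING	0x01
#define ST_PAUSED	0x02
#define ST_SHUTDOWN	0x04
#define ST_DISABLED	0x08

/* Map a state bitfield to a short label for status output. */
static const char *
state_name(unsigned int state)
{
	if (state & ST_PAUSED)
		return ("paused");
	if (state & ST_SHUTDOWN)
		return ("stopping");
	if (state & ST_RUNNING)
		return ("running");
	if (state & ST_DISABLED)
		return ("disabled");
	return ("stopped");
}
```

Pausing then becomes "state |= ST_PAUSED" and unpausing
"state &= ~ST_PAUSED", without touching the other bits.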


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
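
Illustration only: a minimal sketch of the rate limit described above.
The two constants and their values come from the commit message; the
struct, its field names and start_allowed() are hypothetical, and
resetting the counter after a slow restart is an assumption of this
sketch rather than something the message states.

```c
#include <time.h>

#define VM_START_RATE_SEC	6	/* value from the commit message */
#define VM_START_RATE_LIMIT	3	/* value from the commit message */

/* Hypothetical per-VM bookkeeping. */
struct start_rate {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_starts;	/* consecutive fast restarts seen */
};

/*
 * Record a (re)start at time `now`.
 * Returns 1 if the VM may start, 0 if it should be stopped.
 */
static int
start_allowed(struct start_rate *sr, time_t now)
{
	if (sr->last_start != 0 && now - sr->last_start < VM_START_RATE_SEC)
		sr->fast_starts++;
	else
		sr->fast_starts = 0;	/* assumption: slow restart resets */
	sr->last_start = now;
	return (sr->fast_starts < VM_START_RATE_LIMIT);
}
```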


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
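
Illustration only: a minimal sketch of the pattern behind this fix. A
zeroed struct leaves every fd field at 0, which is stdin, so fields are
set to -1 up front and -1 is never passed to close(2). The struct and
helper names are hypothetical and EX_MAXDISKS is not vmd's real limit.

```c
#include <unistd.h>

#define EX_MAXDISKS	4	/* illustrative only */

/* Hypothetical container for the fds a VM holds. */
struct ex_vm_fds {
	int	kernel_fd;
	int	disk_fds[EX_MAXDISKS];
};

/* Mark every fd as "not open" so cleanup cannot hit fd 0 by accident. */
static void
ex_fds_init(struct ex_vm_fds *f)
{
	int i;

	f->kernel_fd = -1;
	for (i = 0; i < EX_MAXDISKS; i++)
		f->disk_fds[i] = -1;
}

/* Close *fd if it is open and reset it to -1; skip close(2) for -1. */
static int
ex_fd_close(int *fd)
{
	int ret;

	if (*fd == -1)
		return (0);
	ret = close(*fd);
	*fd = -1;
	return (ret);
}
```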


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@
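
Illustration only: a minimal sketch of checking a file after opening
it. Because fstat(2) inspects the already open fd, the path cannot be
swapped between the check and the use. The helper name open_checked()
and the specific checks (regular file, matching owner) are assumptions
of this sketch; vmd's actual permission rules are not shown in this log.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open path read-only, then verify on the open fd that it is a regular
 * file owned by uid. Returns the fd, or -1 if any step fails.
 */
static int
open_checked(const char *path, uid_t uid)
{
	struct stat st;
	int fd;

	if ((fd = open(path, O_RDONLY | O_NONBLOCK)) == -1)
		return (-1);
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode) ||
	    st.st_uid != uid) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```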


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name consists of alphanumeric
characters plus '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@
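
Illustration only: a minimal sketch of the naming rule described above.
The function name vm_name_valid() is hypothetical, and rejecting NULL
or empty names is an assumption of this sketch.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Return 1 if name starts with an alphanumeric character and otherwise
 * contains only alphanumerics, '.', '_' or '-'; return 0 if not.
 */
static int
vm_name_valid(const char *name)
{
	size_t i;

	if (name == NULL || name[0] == '\0')
		return (0);
	/* The first character must not be '.', '_' or '-'. */
	if (!isalnum((unsigned char)name[0]))
		return (0);
	for (i = 1; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return (0);
	}
	return (1);
}
```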


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". More network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root, but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it
after the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many
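
The poll-before-accept idea described above can be sketched roughly as
follows. This is an illustration only, not the code from this revision:
wait_readable() is a hypothetical helper name, and a real control loop
would call accept4() once it reports that the socket is readable.

```c
#include <assert.h>
#include <poll.h>

/*
 * Hypothetical helper, not taken from vmd: wait until fd is readable.
 * Returns 1 if readable, 0 on timeout, -1 if poll(2) failed.  poll(2)
 * is not restarted after a signal, so a caller looping on this helper
 * sees -1 (errno == EINTR) and can leave its loop.
 */
int
wait_readable(int fd, int timeout_ms)
{
	struct pollfd pfd;
	int n;

	assert(fd >= 0);
	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;

	n = poll(&pfd, 1, timeout_ms);
	if (n == -1)
		return (-1);
	if (n == 0)
		return (0);
	/* Treat error/hangup conditions without POLLIN as a failure. */
	return ((pfd.revents & POLLIN) ? 1 : -1);
}
```

A loop would call the helper on the listening socket first and only
accept a connection when it returns 1.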


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.151 03-Jul-2023 jasper

when shutting down a vm, handle the VM id in the same way as a VM name argument

ok dv@


# 1.150 18-Jun-2023 op

relax absolute path requirement for configtest (-n)

ok dv@


# 1.149 13-May-2023 dv

vmm(4)/vmd(8): switch to anonymous shared mappings.

While splitting out emulated virtio network and block devices into
separate processes, I originally used named mappings via shm_mkstemp(3).
While this functionally achieved the desired result, it had two
unintended consequences:

1) tearing down a vm process and its child processes required
excessive locking as the guest memory was tied into the VFS layer.

2) it was observed by mlarkin@ that actions in other parts of the
VFS layer could cause some of the guest memory to flush to storage,
possibly filling /tmp.

This commit adds a new vmm(4) ioctl dedicated to allowing a process to
request that the kernel share a mapping of guest memory into its own vm
space. This requires an open fd to /dev/vmm (requiring root) and
both the "vmm" and "proc" pledge(2) promises. In addition, the caller
must know enough about the original memory ranges to reconstruct them
to make the vm's ranges.

Tested with help from Mischa Peters.

ok mlarkin@


# 1.148 12-May-2023 dv

vmd(8): fix segfault on vm creation.

vm_instance was using the wrong vm instance for checking the
vm_kernel_path member. Switch to using the value from the parent
vm instance in the check for if a kernel is known.

Issue reported by kn@. OK mlarkin@, kn@.


# 1.147 12-May-2023 dv

vmd(8): fix console attach from vmctl(8).

Adding in the ability to override the boot kernel created an edge
case in the ipc message handling logic for the parent process (vmd)
when receiving a "start vm" request. Result was incorrectly responding
to the control process, and as a result the vmctl client, with a
bogus "start vm response" reply with an empty tty name.

This commit rewrites the logic of how vmd goes about processing the
"start vm" request with the aim of making it simpler to understand
while addressing the edge case.

Issue reported by kn@. OK mlarkin@.


# 1.146 28-Apr-2023 dv

vmd(8)/vmctl(8): allow vm owners to override boot kernel.

vmd allows non-root users to "own" a vm defined in vm.conf(5). While
the user can start/stop the vm, if they break their filesystem they
have no means of booting recovery media like a ramdisk kernel.

This change opens the provided boot kernel via vmctl and passes the
file descriptor through the control channel to vmd. The next boot
of the vm will use the provided file descriptor as boot kernel/bios.
Subsequent boots (e.g. a reboot) will return to using behavior
defined in vm.conf or the default bios image.

ok mlarkin@


# 1.145 27-Apr-2023 dv

vmd(8): introduce multi-process model for virtio devices.

Isolate virtio network and block device emulation in dedicated
processes, forked and exec'd from the vm process. This allows for
tightening pledge promises to just "stdio".

Communication between the vcpu's and these devices now occurs via
imsg channels, which adds the benefit of not always blocking the
vcpu thread while emulating the device.

With this commit, it's possible that vmd is the first open source
hypervisor that *defaults* to a multi-process device emulation
model without requiring any additional configuration from the
operator.

Testing help from phessler@ and Mischa Peters.

ok mlarkin@


# 1.144 25-Apr-2023 dv

vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.

The object sent to vmm(4) contained file paths and details the
kernel does not need for cpu virtualization as device emulation is
in userland. Effectively, "pull up" the struct members from the
vm_create_params struct to the parent vmop_create_params struct.

This allows us to clean up some of vmd(8) and simplify things for
switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd,
etc.) to allow users to boot recovery ramdisk kernels.

ok mlarkin@


# 1.143 24-Apr-2023 kn

Missing the optional default config is not an error

/var/log/{messages,daemon} logs ENOENT as error on default configless vmd.
Only complain on explicitly passed files and print a debug hint under `-vv'
in case someone forgot to populate their /etc/vm.conf.

OK dv mlarkin


# 1.142 23-Apr-2023 dv

vmd(8): teach vmm process how to exec.

Use execvp(2) to launch vm children with new address spaces.
Consequently, introduces use of unveil(2) into the vmm and vm
processes.

This imposes the requirement of launching vmd with absolute paths,
similar to sshd(8).

ok mlarkin@


# 1.141 19-Apr-2023 jsg

remove duplicate includes


# 1.140 16-Apr-2023 dv

vmd(8): clean up fd closing in vmm process.

Some mild tidying of fd closing in the vmm process in prep for
landing parts of my fork+exec diff.

With input from guenther@ on the nuances of if/when EINTR may happen
in a call to close(2).

ok mlarkin@


# 1.139 02-Apr-2023 dv

vmd(8): migrate vmd_vm.vm_ttyname to char array.

Other structs use a fixed length array already. This allows a vmd_vm
object to be transmitted over an ipc channel, too.

Additionally, solves a segfault caused by a strlcpy(3) in an error
path.

ok mlarkin@


Revision tags: OPENBSD_7_3_BASE
# 1.138 28-Jan-2023 dv

Move some header definitions from vmm(4) to vmd(8).

Part of an ongoing effort to move userland-specific information out
of a kernel header and directly into vmd(8). No functional change.

ok mlarkin@


# 1.137 22-Jan-2023 dv

vmd(8): don't remove known vm's from the config on error.

Multiple error paths, specifically the one related to if a guest
cannot allocate memory at start, resulted in a known vm (via
vm.conf(5)) being removed from the vm list. Adjust the error paths
to check if the failing vm is defined in the config before tearing
it down.

Tested with help from beck@ and Mischa Peters.

ok beck@


# 1.136 14-Jan-2023 dv

Only open /dev/vmm once in vmd(8).

Have the parent process open /dev/vmm and send the fd to the vmm
child process. Only the vmm process and its resulting children
(guest vms) need it for ioctl calls.

ok kn@


# 1.135 28-Dec-2022 jmc

spelling fixes; from paul tagliamonte
any parts of his diff not taken are noted on tech


# 1.134 15-Dec-2022 dv

Add explicit casts to ctype functions in vmd(8).

OK millert@


# 1.133 31-Oct-2022 dv

vmd(8): remove unfinished user accounting.

User accounting and enforcement was never finished. tedu the thing
until someone wants to pick it up and finish it.

Originally found by Matthew Martin.

ok mlarkin@, kn@. input from tb@.


Revision tags: OPENBSD_7_2_BASE
# 1.132 13-Sep-2022 martijn

branches: 1.132.2;
Add (partial) support for agentx in vmd.

Metrics can be found under mib-2.236 and VM-MIB (RFC7666).

Stress tested by and happy noises from Mischa Peters
OK dv@


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

branches: 1.130.2;
vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
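
As a rough illustration of the batching arithmetic described above, the
delay before a given vm starts could be computed as below. The function
name and its parameters are hypothetical and are not taken from vmd;
only the "batch size plus delay between batches" idea comes from the
log message.

```c
#include <assert.h>

/*
 * Hypothetical helper: seconds to wait before starting the vm at
 * position idx (0-based) when vms are started in batches of
 * `parallel` with `delay` seconds between batches.
 */
unsigned int
ex_start_delay(unsigned int idx, unsigned int parallel, unsigned int delay)
{
	assert(parallel > 0);
	return ((idx / parallel) * delay);
}
```

With a batch size of 4 and a 30 second delay, the first four vms start
immediately and the fifth waits for one delay period.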


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
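
A minimal sketch of the convention described above, using open(2) as
the example system call. probe_open() is a hypothetical helper written
for this illustration and does not appear in vmd.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 0 if path can be opened read-only,
 * otherwise the errno value reported by open(2).
 */
int
probe_open(const char *path)
{
	int fd;

	assert(path != NULL);
	/* Compare against -1 exactly; errno is only updated in that case. */
	if ((fd = open(path, O_RDONLY)) == -1)
		return (errno);
	close(fd);
	return (0);
}
```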


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this will make it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
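
The general pattern behind this fix can be sketched as below. The
struct, the array size and the helper names are hypothetical and do not
reflect vmd's actual data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#define EX_NDISKS	4	/* hypothetical array size */

struct ex_vm_fds {		/* hypothetical, not vmd's struct */
	int	kernel_fd;
	int	disk_fds[EX_NDISKS];
};

/* Mark every slot as unused; zeroed memory would look like fd 0. */
void
ex_fds_init(struct ex_vm_fds *f)
{
	size_t i;

	assert(f != NULL);
	f->kernel_fd = -1;
	for (i = 0; i < EX_NDISKS; i++)
		f->disk_fds[i] = -1;
}

/* Close *fd only if it holds a descriptor, then mark it unused. */
void
ex_fd_close(int *fd)
{
	assert(fd != NULL);
	if (*fd == -1)
		return;
	close(*fd);
	*fd = -1;
}
```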


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU attacks for instances.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-'. but does not start with the latter
three.

ok mlarkin@ pd@
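
The naming rule described above could be checked along these lines.
vm_name_ok() is a hypothetical function written for this illustration,
and rejecting the empty string is an assumption not stated in the log
message.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if name is acceptable, 0 otherwise. */
int
vm_name_ok(const char *name)
{
	size_t i;

	assert(name != NULL);
	/* Assumption: an empty name is rejected. */
	if (name[0] == '\0')
		return (0);
	/* The first character may not be '.', '_' or '-'. */
	if (!isalnum((unsigned char)name[0]))
		return (0);
	for (i = 1; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return (0);
	}
	return (1);
}
```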


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows configuring VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@
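
A /31 holds exactly two addresses that differ only in the lowest bit, so
either side can derive the other from its own address. The sketch below shows
just that arithmetic; the function name is hypothetical, and which of the two
addresses goes to the host and which to the VM is not stated in this log.

```c
#include <stdint.h>

/*
 * Return the other address of the /31 that contains addr.
 * addr is an IPv4 address in host byte order.
 */
uint32_t
slash31_peer(uint32_t addr)
{
	return addr ^ 1U;
}
```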


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)
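
The accessor pattern described above can be sketched as follows. The two
function names come from the commit message; the bodies are a simplified
assumption and not a copy of log.c.

```c
/* Kept private to the logging module instead of being "extern". */
static int verbose;

/* Set the verbosity level. */
void
log_setverbose(int v)
{
	verbose = v;
}

/* Return the current verbosity level. */
int
log_getverbose(void)
{
	return verbose;
}
```

Callers then write log_setverbose(2) rather than assigning to a shared
global, which keeps the variable's definition in one translation unit.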


# 1.47 14-Dec-2016 reyk

Allow starting disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled-by-default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow adding an interface to an interface group with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, e.g. "vm1-if0-myvm". Further network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, e.g.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows removing the vm from the
daemon's list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow adding
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root, but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows further reducing the privileges, giving better
pledge(2), and adding some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after the
last VM is terminated. This allows removing the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking for pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.150 18-Jun-2023 op

relax absolute path requirement for configtest (-n)

ok dv@


# 1.149 13-May-2023 dv

vmm(4)/vmd(8): switch to anonymous shared mappings.

While splitting out emulated virtio network and block devices into
separate processes, I originally used named mappings via shm_mkstemp(3).
While this functionally achieved the desired result, it had two
unintended consequences:

1) tearing down a vm process and its child processes required
excessive locking as the guest memory was tied into the VFS layer.

2) it was observed by mlarkin@ that actions in other parts of the
VFS layer could cause some of the guest memory to flush to storage,
possibly filling /tmp.

This commit adds a new vmm(4) ioctl dedicated to allowing a process to
request that the kernel share a mapping of guest memory into its own vm
space. This requires an open fd to /dev/vmm (requiring root) and
both the "vmm" and "proc" pledge(2) promises. In addition, the caller
must know enough about the original memory ranges to reconstruct them
to make the vm's ranges.

Tested with help from Mischa Peters.

ok mlarkin@


# 1.148 12-May-2023 dv

vmd(8): fix segfault on vm creation.

vm_instance was using the wrong vm instance for checking the
vm_kernel_path member. Switch to using the value from the parent
vm instance in the check for whether a kernel is known.

Issue reported by kn@. OK mlarkin@, kn@.


# 1.147 12-May-2023 dv

vmd(8): fix console attach from vmctl(8).

Adding in the ability to override the boot kernel created an edge
case in the ipc message handling logic for the parent process (vmd)
when receiving a "start vm" request. Result was incorrectly responding
to the control process, and as a result the vmctl client, with a
bogus "start vm response" reply with an empty tty name.

This commit rewrites the logic of how vmd goes about processing the
"start vm" request with the aim of making it simpler to understand
while addressing the edge case.

Issue reported by kn@. OK mlarkin@.


# 1.146 28-Apr-2023 dv

vmd(8)/vmctl(8): allow vm owners to override boot kernel.

vmd allows non-root users to "own" a vm defined in vm.conf(5). While
the user can start/stop the vm, if they break their filesystem they
have no means of booting recovery media like a ramdisk kernel.

This change opens the provided boot kernel via vmctl and passes the
file descriptor through the control channel to vmd. The next boot
of the vm will use the provided file descriptor as boot kernel/bios.
Subsequent boots (e.g. a reboot) will return to using behavior
defined in vm.conf or the default bios image.

ok mlarkin@


# 1.145 27-Apr-2023 dv

vmd(8): introduce multi-process model for virtio devices.

Isolate virtio network and block device emulation in dedicated
processes, forked and exec'd from the vm process. This allows for
tightening pledge promises to just "stdio".

Communication between the vcpu's and these devices now occurs via
imsg channels, which adds the benefit of not always blocking the
vcpu thread while emulating the device.

With this commit, it's possible that vmd is the first open source
hypervisor that *defaults* to a multi-process device emulation
model without requiring any additional configuration from the
operator.

Testing help from phessler@ and Mischa Peters.

ok mlarkin@


# 1.144 25-Apr-2023 dv

vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.

The object sent to vmm(4) contained file paths and details the
kernel does not need for cpu virtualization as device emulation is
in userland. Effectively, "pull up" the struct members from the
vm_create_params struct to the parent vmop_create_params struct.

This allows us to clean up some of vmd(8) and simplify things for
switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd,
etc.) to allow users to boot recovery ramdisk kernels.

ok mlarkin@


# 1.143 24-Apr-2023 kn

Missing the optional default config is not an error

/var/log/{messages,daemon} logs ENOENT as error on default configless vmd.
Only complain on explicitly passed files and print a debug hint under `-vv'
in case someone forgot to populate their /etc/vm.conf.

OK dv mlarkin


# 1.142 23-Apr-2023 dv

vmd(8): teach vmm process how to exec.

Use execvp(2) to launch vm children with new address spaces.
Consequently, introduces use of unveil(2) into the vmm and vm
processes.

This imposes the requirement of launching vmd with absolute paths,
similar to sshd(8).

ok mlarkin@


# 1.141 19-Apr-2023 jsg

remove duplicate includes


# 1.140 16-Apr-2023 dv

vmd(8): clean up fd closing in vmm process.

Some mild tidying of fd closing in the vmm process in prep for
landing parts of my fork+exec diff.

With input from guenther@ on the nuances of if/when EINTR may happen
in a call to close(2).

ok mlarkin@


# 1.139 02-Apr-2023 dv

vmd(8): migrate vmd_vm.vm_ttyname to char array.

Other structs use a fixed length array already. This allows a vmd_vm
object to be transmitted over an ipc channel, too.

Additionally, solves a segfault caused by a strlcpy(3) in an error
path.

ok mlarkin@


Revision tags: OPENBSD_7_3_BASE
# 1.138 28-Jan-2023 dv

Move some header definitions from vmm(4) to vmd(8).

Part of an ongoing effort to move userland-specific information out
of a kernel header and directly into vmd(8). No functional change.

ok mlarkin@


# 1.137 22-Jan-2023 dv

vmd(8): don't remove known vm's from the config on error.

Multiple error paths, specifically the one related to if a guest
cannot allocate memory at start, resulted in a known vm (via
vm.conf(5)) being removed from the vm list. Adjust the error paths
to check if the failing vm is defined in the config before tearing
it down.

Tested with help from beck@ and Mischa Peters.

ok beck@


# 1.136 14-Jan-2023 dv

Only open /dev/vmm once in vmd(8).

Have the parent process open /dev/vmm and send the fd to the vmm
child process. Only the vmm process and its resulting children
(guest vms) need it for ioctl calls.

ok kn@


# 1.135 28-Dec-2022 jmc

spelling fixes; from paul tagliamonte
any parts of his diff not taken are noted on tech


# 1.134 15-Dec-2022 dv

Add explicit casts to ctype functions in vmd(8).

OK millert@


# 1.133 31-Oct-2022 dv

vmd(8): remove unfinished user accounting.

User accounting and enforcement was never finished. tedu the thing
until someone wants to pick it up and finish it.

Originally found by Matthew Martin.

ok mlarkin@, kn@. input from tb@.


Revision tags: OPENBSD_7_2_BASE
# 1.132 13-Sep-2022 martijn

branches: 1.132.2;
Add (partial) support for agentx in vmd.

Metrics can be found under mib-2.236 and VM-MIB (RFC7666).

Stress tested by and happy noises from Mischa Peters
OK dv@


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

branches: 1.130.2;
vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.
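
A short read or write happens when read(2) or write(2) transfers fewer bytes
than requested, which is normal on pipes and sockets. The loop below is a
simplified stand-in for the idea behind atomicio, not OpenBSD's
implementation; the name read_full is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Read exactly len bytes from fd unless EOF or an error occurs.
 * Returns the number of bytes read (short only on EOF), or -1 on error.
 */
ssize_t
read_full(int fd, void *buf, size_t len)
{
	char *p = buf;
	size_t off = 0;
	ssize_t n;

	while (off < len) {
		n = read(fd, p + off, len - off);
		if (n == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		if (n == 0)
			break;			/* EOF */
		off += (size_t)n;
	}
	return (ssize_t)off;
}
```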


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses the 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
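
One way to picture staggered startup is to compute a start delay per VM from
its position in the list. This is a sketch of the batching arithmetic only;
the function name and the idea of indexing VMs by position are assumptions,
and the scheduling actually used by vmd may differ.

```c
/*
 * Seconds to wait before starting the VM at position idx (0-based)
 * when VMs start in batches of `parallel` with `delay` seconds
 * between batches. Returns 0 if parallel is 0.
 */
unsigned int
start_delay(unsigned int idx, unsigned int parallel, unsigned int delay)
{
	if (parallel == 0)
		return 0;
	return (idx / parallel) * delay;
}
```

With 4 online CPUs and the default 30 second delay, VMs 0 to 3 start
immediately, VMs 4 to 7 after 30 seconds, and so on.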


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
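
The stricter check compares against -1 exactly rather than testing for any
negative value. A minimal sketch, with a hypothetical helper name and
open(2) chosen only as an example system call:

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Open path read-only and store the descriptor in *fdp.
 * Returns 0 on success or the errno value from open(2) on failure.
 */
int
try_open(const char *path, int *fdp)
{
	int fd;

	/* Failure is signalled by -1 exactly; errno is valid only then. */
	if ((fd = open(path, O_RDONLY)) == -1)
		return errno;
	*fdp = fd;
	return 0;
}
```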


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
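
A single state bitfield might look like the sketch below. The flag names,
their values, and the reporting order are hypothetical and chosen only for
illustration; they are not taken from vmd's headers.

```c
/* Hypothetical flag names for this sketch; vmd's real names may differ. */
#define ST_RUNNING	0x01
#define ST_PAUSED	0x02
#define ST_DISABLED	0x04
#define ST_SHUTDOWN	0x08

/* Map a state bitfield to a short label, most specific state first. */
const char *
vm_state_name(unsigned int state)
{
	if (state & ST_PAUSED)
		return "paused";
	if (state & ST_SHUTDOWN)
		return "stopping";
	if (state & ST_RUNNING)
		return "running";
	if (state & ST_DISABLED)
		return "disabled";
	return "stopped";
}
```

Because several bits can be set at once, a paused VM can still carry the
running bit, which separate boolean variables made awkward to keep in sync.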


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
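
The rate limit described above can be sketched as a small counter update.
The two constants and their values come from the commit message; the struct,
its field names, and the reset of the counter after a slow reboot are
assumptions made for this illustration.

```c
#include <time.h>

#define VM_START_RATE_SEC	6	/* minimum seconds between boots */
#define VM_START_RATE_LIMIT	3	/* fast reboots tolerated */

/* Hypothetical per-VM bookkeeping; the field names are assumptions. */
struct vm_rate {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_count;	/* consecutive fast reboots seen so far */
};

/*
 * Record a start at time now. Returns 1 if the VM may start,
 * 0 if it has rebooted too quickly too often and should be stopped.
 */
int
vm_rate_check(struct vm_rate *r, time_t now)
{
	if (r->last_start != 0 && now - r->last_start < VM_START_RATE_SEC)
		r->fast_count++;
	else
		r->fast_count = 0;	/* assumption: slow reboot resets */
	r->last_start = now;
	return r->fast_count < VM_START_RATE_LIMIT;
}
```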


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
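
A zero-filled struct leaves every fd field at 0, which is stdin, so a later
cleanup pass would close it by mistake. The sketch below shows the pattern
of marking unused descriptors with -1 and skipping them on close; the struct
and helper names are hypothetical and not vmd's own.

```c
#include <unistd.h>

#define NDISKS	4	/* hypothetical limit for this sketch */

struct vm_fds {		/* hypothetical, not vmd's struct vmd_vm */
	int	kernel_fd;
	int	disk_fds[NDISKS];
};

/* Mark every descriptor as unused; 0 would alias stdin. */
void
vm_fds_init(struct vm_fds *v)
{
	int i;

	v->kernel_fd = -1;
	for (i = 0; i < NDISKS; i++)
		v->disk_fds[i] = -1;
}

/* Close *fdp only if it is open, then mark it unused again. */
void
fd_close(int *fdp)
{
	if (*fdp != -1) {
		close(*fdp);
		*fdp = -1;
	}
}
```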


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks, for instance.

OK mlarkin@
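
Checking a path with stat(2) and then opening it leaves a window in which
the file can be swapped. Opening first and checking the descriptor with
fstat(2) closes that window. The sketch below is an illustration of the
ordering only; the helper name and the specific checks (regular file, owner
uid) are assumptions and not the checks vmd performs.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open path and verify on the open descriptor that it is a regular
 * file owned by uid. Returns the fd, or -1 on any failure.
 */
int
open_owned(const char *path, uid_t uid)
{
	struct stat st;
	int fd;

	if ((fd = open(path, O_RDONLY)) == -1)
		return -1;
	/* fstat(2) inspects the file we actually opened, not the path. */
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode) ||
	    st.st_uid != uid) {
		close(fd);
		return -1;
	}
	return fd;
}
```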


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-'. but does not start with the latter
three.

ok mlarkin@ pd@
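
The naming rule above can be sketched as a small predicate. This is an illustration under stated assumptions, not the function from the commit: the name ex_vm_name_ok is hypothetical, and rejecting the empty string is an assumption the log message does not spell out.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Return 1 if name is acceptable, 0 otherwise.
 * Allowed characters: alphanumerics plus '.', '_' and '-'.
 * The first character must not be '.', '_' or '-'.
 */
int
ex_vm_name_ok(const char *name)
{
	size_t	 i;

	/* Assumption: NULL and empty names are rejected. */
	if (name == NULL || name[0] == '\0')
		return 0;

	if (name[0] == '.' || name[0] == '_' || name[0] == '-')
		return 0;

	for (i = 0; name[i] != '\0'; i++) {
		/* Cast to unsigned char before calling ctype functions. */
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return 0;
	}
	return 1;
}
```

For example, "openbsd.vm" and "vm-1_test" pass, while "-foo", ".hidden" and "my vm" are rejected.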


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of the VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) but runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.146 28-Apr-2023 dv

vmd(8)/vmctl(8): allow vm owners to override boot kernel.

vmd allows non-root users to "own" a vm defined in vm.conf(5). While
the user can start/stop the vm, if they break their filesystem they
have no means of booting recovery media like a ramdisk kernel.

This change opens the provided boot kernel via vmctl and passes the
file descriptor through the control channel to vmd. The next boot
of the vm will use the provided file descriptor as boot kernel/bios.
Subsequent boots (e.g. a reboot) will return to using behavior
defined in vm.conf or the default bios image.

ok mlarkin@


# 1.145 27-Apr-2023 dv

vmd(8): introduce multi-process model for virtio devices.

Isolate virtio network and block device emulation in dedicated
processes, forked and exec'd from the vm process. This allows for
tightening pledge promises to just "stdio".

Communication between the vcpu's and these devices now occurs via
imsg channels, which adds the benefit of not always blocking the
vcpu thread while emulating the device.

With this commit, it's possible that vmd is the first open source
hypervisor that *defaults* to a multi-process device emulation
model without requiring any additional configuration from the
operator.

Testing help from phessler@ and Mischa Peters.

ok mlarkin@


# 1.144 25-Apr-2023 dv

vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.

The object sent to vmm(4) contained file paths and details the
kernel does not need for cpu virtualization as device emulation is
in userland. Effectively, "pull up" the struct members from the
vm_create_params struct to the parent vmop_create_params struct.

This allows us to clean up some of vmd(8) and simplify things for
switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd,
etc.) to allow users to boot recovery ramdisk kernels.

ok mlarkin@


# 1.143 24-Apr-2023 kn

Missing the optional default config is not an error

/var/log/{messages,daemon} logs ENOENT as error on default configless vmd.
Only complain on explicitly passed files and print a debug hint under `-vv'
in case someone forgot to populate their /etc/vm.conf.

OK dv mlarkin


# 1.142 23-Apr-2023 dv

vmd(8): teach vmm process how to exec.

Use execvp(2) to launch vm children with new address spaces.
Consequently, introduces use of unveil(2) into the vmm and vm
processes.

This imposes the requirement of launching vmd with absolute paths,
similar to sshd(8).

ok mlarkin@


# 1.141 19-Apr-2023 jsg

remove duplicate includes


# 1.140 16-Apr-2023 dv

vmd(8): clean up fd closing in vmm process.

Some mild tidying of fd closing in the vmm process in prep for
landing parts of my fork+exec diff.

With input from guenther@ on the nuances of if/when EINTR may happen
in a call to close(2).

ok mlarkin@


# 1.139 02-Apr-2023 dv

vmd(8): migrate vmd_vm.vm_ttyname to char array.

Other structs use a fixed length array already. This allows a vmd_vm
object to be transmitted over an ipc channel, too.

Additionally, solves a segfault caused by a strlcpy(3) in an error
path.

ok mlarkin@


Revision tags: OPENBSD_7_3_BASE
# 1.138 28-Jan-2023 dv

Move some header definitions from vmm(4) to vmd(8).

Part of an ongoing effort to move userland-specific information out
of a kernel header and directly into vmd(8). No functional change.

ok mlarkin@


# 1.137 22-Jan-2023 dv

vmd(8): don't remove known vm's from the config on error.

Multiple error paths, specifically the one related to if a guest
cannot allocate memory at start, resulted in a known vm (via
vm.conf(5)) being removed from the vm list. Adjust the error paths
to check if the failing vm is defined in the config before tearing
it down.

Tested with help from beck@ and Mischa Peters.

ok beck@


# 1.136 14-Jan-2023 dv

Only open /dev/vmm once in vmd(8).

Have the parent process open /dev/vmm and send the fd to the vmm
child process. Only the vmm process and its resulting children
(guest vms) need it for ioctl calls.

ok kn@


# 1.135 28-Dec-2022 jmc

spelling fixes; from paul tagliamonte
any parts of his diff not taken are noted on tech


# 1.134 15-Dec-2022 dv

Add explicit casts to ctype functions in vmd(8).

OK millert@


# 1.133 31-Oct-2022 dv

vmd(8): remove unfinished user accounting.

User accounting and enforcement was never finished. tedu the thing
until someone wants to pick it up and finish it.

Originally found by Matthew Martin.

ok mlarkin@, kn@. input from tb@.


Revision tags: OPENBSD_7_2_BASE
# 1.132 13-Sep-2022 martijn

branches: 1.132.2;
Add (partial) support for agentx in vmd.

Metrics can be found under mib-2.236 and VM-MIB (RFC7666).

Stress tested by and happy noises from Mischa Peters
OK dv@


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

branches: 1.130.2;
vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
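
One way to picture the batching is to compute how long the n-th configured vm waits before it is started. This is only a sketch of the arithmetic: the function name is hypothetical, and vmd(8) itself may schedule the batches differently (for example with timers rather than precomputed delays).

```c
/*
 * Delay in seconds before starting the vm at position idx (0-based)
 * when vms are started in batches of `parallel`, with `delay_sec`
 * seconds between batches.  A parallelism of 0 is treated as 1 so
 * the division is always defined (an assumption of this sketch).
 */
unsigned int
ex_start_delay(unsigned int idx, unsigned int parallel,
    unsigned int delay_sec)
{
	if (parallel == 0)
		parallel = 1;
	return (idx / parallel) * delay_sec;
}
```

With 4 CPUs online and the default 30 second delay, vms 0 to 3 start immediately, vms 4 to 7 after 30 seconds, vms 8 to 11 after 60 seconds, and so on.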


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
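
A minimal sketch of tracking state in a single bitfield instead of separate variables. The flag names and values here are hypothetical (prefixed EX_ to mark them as illustrative), as is the reporting order; they are not taken from vmd.h.

```c
/* Hypothetical state flags, one bit each. */
#define EX_STATE_RUNNING	0x01
#define EX_STATE_DISABLED	0x02
#define EX_STATE_SHUTDOWN	0x04
#define EX_STATE_PAUSED		0x08

/*
 * Map a state bitfield to a short description.  More specific
 * states are tested first: a paused vm still has the running bit
 * set, so "paused" must win over "running".
 */
const char *
ex_state_name(unsigned int state)
{
	if (state & EX_STATE_PAUSED)
		return "paused";
	if (state & EX_STATE_SHUTDOWN)
		return "stopping";
	if (state & EX_STATE_RUNNING)
		return "running";
	if (state & EX_STATE_DISABLED)
		return "disabled";
	return "stopped";
}
```

Setting a state becomes `state |= EX_STATE_PAUSED` and clearing it `state &= ~EX_STATE_PAUSED`, which keeps the unrelated bits intact.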


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
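
The rate-limiting rule can be sketched as below. The two constants and their values come from the log message; the struct, the function name, and the choice to reset the counter after a slow restart are assumptions of this sketch and may not match the committed code.

```c
#include <time.h>

#define VM_START_RATE_SEC	6	/* restarts faster than this count */
#define VM_START_RATE_LIMIT	3	/* fast restarts allowed before stop */

/* Per-vm bookkeeping (hypothetical layout). */
struct ex_start_rate {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_count;	/* consecutive fast restarts seen so far */
};

/*
 * Record a (re)start at time `now`.
 * Returns 0 if the vm may start, -1 if it restarted too quickly
 * too many times and should be stopped instead.
 */
int
ex_start_rate_check(struct ex_start_rate *sr, time_t now)
{
	if (sr->last_start != 0 &&
	    now - sr->last_start < VM_START_RATE_SEC)
		sr->fast_count++;
	else
		sr->fast_count = 0;	/* assumption: slow restart resets */

	sr->last_start = now;
	return (sr->fast_count >= VM_START_RATE_LIMIT) ? -1 : 0;
}
```

A vm that starts at t=100 and then reboots at t=101, t=102 and t=103 is refused on the third fast reboot, while a vm that reboots once every ten seconds is never limited.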


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@
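
The lookup order described above (derived image first, then the base) can
be sketched as a walk along a chain of images. This is a simplified model
for illustration only: struct image, NCLUSTERS and lookup_cluster() are
hypothetical and do not reflect the qcow2 on-disk layout or vmd's code.

```c
#include <stddef.h>

#define NCLUSTERS	4	/* tiny fixed-size image for the example */

/* Hypothetical in-memory model of an image and its optional base. */
struct image {
	const char		*clusters[NCLUSTERS];	/* NULL: not present */
	const struct image	*base;			/* NULL: no base */
};

/*
 * Return the data for cluster `idx`, searching the derived image first
 * and falling back to each base image in turn.  NULL means no image in
 * the chain holds the cluster.
 */
static const char *
lookup_cluster(const struct image *img, size_t idx)
{
	if (idx >= NCLUSTERS)
		return NULL;
	for (; img != NULL; img = img->base) {
		if (img->clusters[idx] != NULL)
			return img->clusters[idx];
	}
	return NULL;
}
```

Because the derived image only stores clusters that differ from the base,
changing the base afterwards changes what the fallback returns, which is
why modifying the base corrupts the derived image.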


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
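
A minimal sketch of the pattern this fix relies on: fd fields start out as
-1 so that a zero-initialized field is never mistaken for stdin, and -1 is
never passed to close(2). The struct and function names here are
hypothetical and not taken from vmd.

```c
#include <unistd.h>

#define NFDS	4

/* Hypothetical container for a VM's file descriptors. */
struct vm_fds {
	int	fd[NFDS];
};

/* Mark every slot as "not open"; 0 would alias stdin. */
static void
vm_fds_init(struct vm_fds *v)
{
	int i;

	for (i = 0; i < NFDS; i++)
		v->fd[i] = -1;
}

/* Close only the slots that hold a descriptor; return how many. */
static int
vm_fds_close(struct vm_fds *v)
{
	int i, closed = 0;

	for (i = 0; i < NFDS; i++) {
		if (v->fd[i] == -1)
			continue;
		close(v->fd[i]);
		v->fd[i] = -1;
		closed++;
	}
	return closed;
}
```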


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks, for instance.

OK mlarkin@
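
The idea of checking after opening can be sketched with fstat(2) on the
already opened descriptor, so the file that is checked is the file that
gets used. This is an illustrative sketch only: open_checked() and its
ownership policy (regular file, owned by the given uid unless that uid is
0) are assumptions, not vmd's actual permission rules.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open `path` read-only, then validate the opened object itself.
 * Returns the fd on success, -1 on failure.
 */
static int
open_checked(const char *path, uid_t uid)
{
	struct stat	 st;
	int		 fd;

	if ((fd = open(path, O_RDONLY | O_NONBLOCK)) == -1)
		return -1;

	/* fstat() inspects the fd, not the path, so a later rename or
	 * symlink swap cannot change what was checked. */
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode) ||
	    (uid != 0 && st.st_uid != uid)) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Calling stat(2) on the path before open(2) would leave a window in which
the path could be replaced between the check and the use.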


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@
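
The naming rule above can be expressed as a small predicate. This is a
sketch under stated assumptions rather than the function added by this
commit: valid_vm_name() is a hypothetical name, and rejecting NULL and
empty names is an assumption of the example.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * A name may contain alphanumerics, '.', '_' and '-', but must not
 * start with '.', '_' or '-'.
 */
static bool
valid_vm_name(const char *name)
{
	const char *p;

	if (name == NULL || *name == '\0')
		return false;	/* assumption: empty names are invalid */
	if (*name == '.' || *name == '_' || *name == '-')
		return false;

	for (p = name; *p != '\0'; p++) {
		/* Cast to unsigned char before calling a ctype function. */
		if (!isalnum((unsigned char)*p) &&
		    *p != '.' && *p != '_' && *p != '-')
			return false;
	}
	return true;
}
```

For example, "openbsd.vm" and "vm-1_test" pass, while "-foo", ".hidden"
and "a b" are rejected.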


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Further network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after the
last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.144 25-Apr-2023 dv

vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.

The object sent to vmm(4) contained file paths and details the
kernel does not need for cpu virtualization as device emulation is
in userland. Effectively, "pull up" the struct members from the
vm_create_params struct to the parent vmop_create_params struct.

This allows us to clean up some of vmd(8) and simplify things for
switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd,
etc.) to allow users to boot recovery ramdisk kernels.

ok mlarkin@


# 1.143 24-Apr-2023 kn

Missing the optional default config is not an error

/var/log/{messages,daemon} logs ENOENT as error on default configless vmd.
Only complain on explicitly passed files and print a debug hint under `-vv'
in case someone forgot to populate their /etc/vm.conf.

OK dv mlarkin


# 1.142 23-Apr-2023 dv

vmd(8): teach vmm process how to exec.

Use execvp(2) to launch vm children with new address spaces.
Consequently, introduces use of unveil(2) into the vmm and vm
processes.

This imposes the requirement of launching vmd with absolute paths,
similar to sshd(8).

ok mlarkin@


# 1.141 19-Apr-2023 jsg

remove duplicate includes


# 1.140 16-Apr-2023 dv

vmd(8): clean up fd closing in vmm process.

Some mild tidying of fd closing in the vmm process in prep for
landing parts of my fork+exec diff.

With input from guenther@ on the nuances of if/when EINTR may happen
in a call to close(2).

ok mlarkin@


# 1.139 02-Apr-2023 dv

vmd(8): migrate vmd_vm.vm_ttyname to char array.

Other structs use a fixed length array already. This allows a vmd_vm
object to be transmitted over an ipc channel, too.

Additionally, solves a segfault caused by a strlcpy(3) in an error
path.

ok mlarkin@


Revision tags: OPENBSD_7_3_BASE
# 1.138 28-Jan-2023 dv

Move some header definitions from vmm(4) to vmd(8).

Part of an ongoing effort to move userland-specific information out
of a kernel header and directly into vmd(8). No functional change.

ok mlarkin@


# 1.137 22-Jan-2023 dv

vmd(8): don't remove known vm's from the config on error.

Multiple error paths, specifically the one related to if a guest
cannot allocate memory at start, resulted in a known vm (via
vm.conf(5)) being removed from the vm list. Adjust the error paths
to check if the failing vm is defined in the config before tearing
it down.

Tested with help from beck@ and Mischa Peters.

ok beck@


# 1.136 14-Jan-2023 dv

Only open /dev/vmm once in vmd(8).

Have the parent process open /dev/vmm and send the fd to the vmm
child process. Only the vmm process and its resulting children
(guest vms) need it for ioctl calls.

ok kn@


# 1.135 28-Dec-2022 jmc

spelling fixes; from paul tagliamonte
any parts of his diff not taken are noted on tech


# 1.134 15-Dec-2022 dv

Add explicit casts to ctype functions in vmd(8).

OK millert@


# 1.133 31-Oct-2022 dv

vmd(8): remove unfinished user accounting.

User accounting and enforcement was never finished. tedu the thing
until someone wants to pick it up and finish it.

Originally found by Matthew Martin.

ok mlarkin@, kn@. input from tb@.


Revision tags: OPENBSD_7_2_BASE
# 1.132 13-Sep-2022 martijn

branches: 1.132.2;
Add (partial) support for agentx in vmd.

Metrics can be found under mib-2.236 and VM-MIB (RFC7666).

Stress tested by and happy noises from Mischa Peters
OK dv@


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

branches: 1.130.2;
vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
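
One way to picture the staggered start is as fixed-size batches separated
by a delay. The helper below is a hypothetical illustration of that
arithmetic, not the scheduling code from this commit; start_delay() and
its parameters are assumptions.

```c
/*
 * Seconds to wait before starting the VM at 0-based position `idx`,
 * when `parallel` VMs start per batch and batches are `delay` seconds
 * apart.
 */
static unsigned int
start_delay(unsigned int idx, unsigned int parallel, unsigned int delay)
{
	if (parallel == 0)
		parallel = 1;	/* guard against division by zero */
	return (idx / parallel) * delay;
}
```

With 4 CPUs online and the default 30 second delay, VMs 0 to 3 start
immediately, VMs 4 to 7 after 30 seconds, and VM 9 after 60 seconds.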


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
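
The convention described above looks like this in practice. The wrapper
below is a hypothetical example written for this note, not code from the
commit: it compares the result of open(2) against -1 exactly and reads
errno only in that case.

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Open `path` read-only.  On failure return -1 and store errno in
 * *err; on success return the fd and store 0 in *err.
 */
static int
open_ro(const char *path, int *err)
{
	int fd;

	/* Test for -1, not "< 0": -1 is the documented error value, and
	 * errno is only meaningful when it is returned. */
	if ((fd = open(path, O_RDONLY)) == -1) {
		*err = errno;
		return -1;
	}
	*err = 0;
	return fd;
}
```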


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
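
Tracking state in a single bitfield can be sketched as below. The flag
names and helpers are hypothetical and chosen for illustration; they are
not the definitions introduced by this commit.

```c
#include <stdbool.h>

/* Hypothetical flag names; vmd's own definitions may differ. */
#define EX_STATE_RUNNING	0x01
#define EX_STATE_PAUSED		0x02
#define EX_STATE_DISABLED	0x04

/* Only a running VM can be paused; it remains "running" while paused. */
static unsigned int
state_pause(unsigned int state)
{
	if (state & EX_STATE_RUNNING)
		state |= EX_STATE_PAUSED;
	return state;
}

/* Clear the paused bit and leave every other bit untouched. */
static unsigned int
state_unpause(unsigned int state)
{
	return state & ~EX_STATE_PAUSED;
}

static bool
state_is_paused(unsigned int state)
{
	return (state & EX_STATE_PAUSED) != 0;
}
```

A single word of flags lets one test or report combined conditions such as
"running and paused" with one mask instead of consulting several separate
variables.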


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
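
A rough sketch of the rate-limiting rule described above, not the committed code. The two #define values come from the commit message. struct ex_vm_rate and ex_vm_may_start are hypothetical names, and resetting the counter after a slow restart is an assumption.

```c
#include <time.h>

#define VM_START_RATE_SEC	6	/* value from the commit message */
#define VM_START_RATE_LIMIT	3	/* value from the commit message */

/* Hypothetical per-VM bookkeeping; not the real vmd structure. */
struct ex_vm_rate {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_count;	/* number of consecutive fast restarts */
};

/*
 * Record a (re)start at time "now".
 * Returns 1 if the VM may start, 0 if it hit the limit and should stop.
 */
int
ex_vm_may_start(struct ex_vm_rate *r, time_t now)
{
	if (r->last_start != 0 && now - r->last_start < VM_START_RATE_SEC)
		r->fast_count++;
	else
		r->fast_count = 0;	/* assumption: slow restart resets */
	r->last_start = now;
	return (r->fast_count < VM_START_RATE_LIMIT);
}
```

Because the limits are compile-time constants in this sketch as well, changing them means recompiling, which matches the note above about fuzzing setups.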


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
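
The failure mode is easy to reproduce in miniature: a zero-initialized fd field looks like a valid descriptor 0 (stdin) to any cleanup code. A minimal sketch of the fix pattern follows. struct ex_fds and its helpers are hypothetical and not taken from vmd.

```c
#include <unistd.h>

#define EX_NFDS	4

/* Hypothetical container for a VM's descriptors. */
struct ex_fds {
	int	fd[EX_NFDS];
};

/* Mark every slot as unused; 0 would be mistaken for stdin. */
void
ex_fds_init(struct ex_fds *f)
{
	int i;

	for (i = 0; i < EX_NFDS; i++)
		f->fd[i] = -1;
}

/* Close only the slots that hold a descriptor; return how many. */
int
ex_fds_close(struct ex_fds *f)
{
	int i, n = 0;

	for (i = 0; i < EX_NFDS; i++) {
		if (f->fd[i] == -1)
			continue;	/* avoid close(-1) */
		close(f->fd[i]);
		f->fd[i] = -1;
		n++;
	}
	return (n);
}
```

Resetting each slot to -1 after closing also makes a second cleanup pass harmless.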


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU attacks for instances.

OK mlarkin@
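
The general pattern is to open first and then inspect the open descriptor with fstat(2), so the check applies to the same object that will later be used. A minimal sketch under stated assumptions: ex_open_checked is a hypothetical name, and the policy shown (regular file, owned by the given uid or world-readable) is only an example, not vmd's actual rule.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open "path" and validate it via the descriptor.
 * Returns the fd on success, -1 if the open or the check fails.
 */
int
ex_open_checked(const char *path, uid_t uid)
{
	struct stat	 st;
	int		 fd;

	if ((fd = open(path, O_RDONLY)) == -1)
		return (-1);

	/* fstat(2) looks at the file we actually opened, not the path. */
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode))
		goto fail;

	/* Example policy (an assumption): owner match or world-readable. */
	if (st.st_uid != uid && (st.st_mode & S_IROTH) == 0)
		goto fail;

	return (fd);
fail:
	close(fd);
	return (-1);
}
```

Calling stat(2) on the path before open(2) would leave a window in which the path could be swapped for another file. Checking the descriptor closes that window.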


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@
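
The naming rule can be sketched as below. This is an illustration only. ex_valid_name is a hypothetical function, and rejecting NULL or empty names is an assumption the commit message does not state.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical validator for the rule described above: alphanumerics
 * plus '.', '_' and '-', where the first character may not be one of
 * those three. Returns 1 if the name is valid, 0 otherwise.
 */
int
ex_valid_name(const char *name)
{
	size_t i;

	if (name == NULL || name[0] == '\0')
		return (0);	/* assumption: empty names are rejected */
	if (name[0] == '.' || name[0] == '_' || name[0] == '-')
		return (0);

	for (i = 0; name[i] != '\0'; i++) {
		/* ctype functions need an unsigned char value */
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return (0);
	}
	return (1);
}
```

Under this sketch, "openbsd.vm" is accepted, while "-foo" and "a b" are rejected.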


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Further network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root, but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking for pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.144 25-Apr-2023 dv

vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.

The object sent to vmm(4) contained file paths and details the
kernel does not need for cpu virtualization as device emulation is
in userland. Effectively, "pull up" the struct members from the
vm_create_params struct to the parent vmop_create_params struct.

This allows us to clean up some of vmd(8) and simplify things for
switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd,
etc.) to allow users to boot recovery ramdisk kernels.

ok mlarkin@


# 1.143 24-Apr-2023 kn

Missing the optional default config is not an error

/var/log/{messages,daemon} logs ENOENT as error on default configless vmd.
Only complain on explicitly passed files and print a debug hint under `-vv'
in case someone forgot to populate their /etc/vm.conf.

OK dv mlarkin


# 1.142 23-Apr-2023 dv

vmd(8): teach vmm process how to exec.

Use execvp(2) to launch vm children with new address spaces.
Consequently, introduces use of unveil(2) into the vmm and vm
processes.

This imposes the requirement of launching vmd with absolute paths,
similar to sshd(8).

ok mlarkin@


# 1.141 19-Apr-2023 jsg

remove duplicate includes


# 1.140 16-Apr-2023 dv

vmd(8): clean up fd closing in vmm process.

Some mild tidying of fd closing in the vmm process in prep for
landing parts of my fork+exec diff.

With input from guenther@ on the nuances of if/when EINTR may happen
in a call to close(2).

ok mlarkin@


# 1.139 02-Apr-2023 dv

vmd(8): migrate vmd_vm.vm_ttyname to char array.

Other structs use a fixed length array already. This allows a vmd_vm
object to be transmitted over an ipc channel, too.

Additionally, solves a segfault caused by a strlcpy(3) in an error
path.

ok mlarkin@


Revision tags: OPENBSD_7_3_BASE
# 1.138 28-Jan-2023 dv

Move some header definitions from vmm(4) to vmd(8).

Part of an ongoing effort to move userland-specific information out
of a kernel header and directly into vmd(8). No functional change.

ok mlarkin@


# 1.137 22-Jan-2023 dv

vmd(8): don't remove known vm's from the config on error.

Multiple error paths, specifically the one related to if a guest
cannot allocate memory at start, resulted in a known vm (via
vm.conf(5)) being removed from the vm list. Adjust the error paths
to check if the failing vm is defined in the config before tearing
it down.

Tested with help from beck@ and Mischa Peters.

ok beck@


# 1.136 14-Jan-2023 dv

Only open /dev/vmm once in vmd(8).

Have the parent process open /dev/vmm and send the fd to the vmm
child process. Only the vmm process and its resulting children
(guest vms) need it for ioctl calls.

ok kn@


# 1.135 28-Dec-2022 jmc

spelling fixes; from paul tagliamonte
any parts of his diff not taken are noted on tech


# 1.134 15-Dec-2022 dv

Add explicit casts to ctype functions in vmd(8).

OK millert@


# 1.133 31-Oct-2022 dv

vmd(8): remove unfinished user accounting.

User accounting and enforcement was never finished. tedu the thing
until someone wants to pick it up and finish it.

Originally found by Matthew Martin.

ok mlarkin@, kn@. input from tb@.


Revision tags: OPENBSD_7_2_BASE
# 1.132 13-Sep-2022 martijn

branches: 1.132.2;
Add (partial) support for agentx in vmd.

Metrics can be found under mib-2.236 and VM-MIB (RFC7666).

Stress tested by and happy noises from Mischa Peters
OK dv@


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

branches: 1.130.2;
vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
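
A minimal sketch of the batching idea described above, not the code from this commit: vms are grouped into batches of `parallel` and each batch waits `delay` seconds longer than the previous one. The helper name stagger_delay and its parameters are hypothetical; the actual scheduling in vmd(8) may differ.

```c
#include <stddef.h>

/*
 * Hypothetical helper: seconds to wait before starting the vm at
 * position idx (0-based) when vms are started in batches of
 * `parallel` with `delay` seconds between batches.
 */
unsigned int
stagger_delay(size_t idx, size_t parallel, unsigned int delay)
{
	if (parallel == 0)
		parallel = 1;	/* avoid division by zero */
	return ((unsigned int)(idx / parallel) * delay);
}
```

With parallel set to the number of online CPUs and delay set to 30, the first batch starts immediately and every later batch starts 30 seconds after the one before it.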


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
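
A minimal sketch of the convention this message describes, not code taken from vmd(8): compare a system call's result against -1 exactly and read errno only on that path. The helper name close_checked is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: close fd and report failure by comparing
 * against -1 rather than "< 0".  Returns 0 on success, or the errno
 * value on failure.
 */
int
close_checked(int fd)
{
	if (close(fd) == -1)
		return (errno);	/* errno is only meaningful here */
	return (0);
}
```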


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
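
A minimal sketch of tracking state in one bitfield, assuming made-up names: the EX_VM_STATE_* flags and struct ex_vm below are hypothetical and are not the definitions used in vmd(8).

```c
/* Hypothetical flag names; the real definitions live in vmd's headers. */
#define EX_VM_STATE_RUNNING	0x01
#define EX_VM_STATE_PAUSED	0x02
#define EX_VM_STATE_DISABLED	0x04

struct ex_vm {
	unsigned int	state;	/* one bitfield instead of several ints */
};

void
ex_vm_pause(struct ex_vm *vm)
{
	vm->state |= EX_VM_STATE_PAUSED;
}

void
ex_vm_unpause(struct ex_vm *vm)
{
	vm->state &= ~EX_VM_STATE_PAUSED;
}

int
ex_vm_is_paused(const struct ex_vm *vm)
{
	return ((vm->state & EX_VM_STATE_PAUSED) != 0);
}
```

Because each state is a separate bit, a vm can be running and paused at the same time, and a single mask test answers either question.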


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
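
A minimal sketch of the rate limit described above, using the two values from the message. The struct ex_rate bookkeeping, the function name, and the choice to reset the counter after a slow restart are assumptions for illustration, not the code from this commit.

```c
#include <time.h>

#define VM_START_RATE_SEC	6	/* min. seconds between restarts */
#define VM_START_RATE_LIMIT	3	/* fast restarts before stopping */

/* Hypothetical per-vm bookkeeping. */
struct ex_rate {
	time_t	last_start;
	int	fast_count;
};

/*
 * Record a (re)start at time `now`.  Returns 1 if the vm should be
 * stopped because it restarted too quickly too many times, else 0.
 */
int
ex_rate_check(struct ex_rate *r, time_t now)
{
	if (r->last_start != 0 && now - r->last_start < VM_START_RATE_SEC)
		r->fast_count++;
	else
		r->fast_count = 0;	/* assumption: slow restart resets */
	r->last_start = now;
	return (r->fast_count >= VM_START_RATE_LIMIT);
}
```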


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
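
A minimal sketch of the pattern behind this fix, with hypothetical names (struct ex_fds, EX_NFDS): a zero-filled structure leaves every fd field at 0, which is a valid descriptor, so slots are set to -1 up front and close(2) is skipped for slots that still hold -1.

```c
#include <unistd.h>

#define EX_NFDS	4	/* hypothetical number of fd slots */

struct ex_fds {
	int	fd[EX_NFDS];
};

/* Mark every slot as "not open" so fd 0 is never closed by mistake. */
void
ex_fds_init(struct ex_fds *f)
{
	int	i;

	for (i = 0; i < EX_NFDS; i++)
		f->fd[i] = -1;
}

/* Close only slots that hold a real descriptor; returns how many. */
int
ex_fds_close(struct ex_fds *f)
{
	int	i, n = 0;

	for (i = 0; i < EX_NFDS; i++) {
		if (f->fd[i] == -1)
			continue;	/* avoid close(-1) */
		close(f->fd[i]);
		f->fd[i] = -1;
		n++;
	}
	return (n);
}
```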


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@
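
A minimal sketch of checking after open rather than before, assuming a simple ownership rule: the helper ex_open_owned and its policy (regular file owned by the given uid) are hypothetical and are not the checks vmd(8) performs.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open path read-only, then check ownership on
 * the open descriptor with fstat(2).  Checking after open(2) means
 * the file examined is the file that will be used, even if the path
 * is replaced in between.  Returns the fd, or -1 on failure.
 */
int
ex_open_owned(const char *path, uid_t uid)
{
	struct stat	 st;
	int		 fd;

	if ((fd = open(path, O_RDONLY | O_NONBLOCK)) == -1)
		return (-1);
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode) ||
	    st.st_uid != uid) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```

A stat(2) on the path followed by open(2) leaves a window in which the path can be swapped; fstat(2) on the descriptor closes that window.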


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@
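
A minimal sketch of the naming rule stated above, not the validator from this commit. The function name ex_valid_name is hypothetical, and rejecting NULL or empty names is an added assumption.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical validator following the rule in the commit message:
 * alphanumerics plus '.', '_' and '-', but the first character must
 * be alphanumeric.  Returns 1 if valid, 0 otherwise.
 */
int
ex_valid_name(const char *name)
{
	size_t	i;

	if (name == NULL || name[0] == '\0')
		return (0);
	for (i = 0; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (isalnum(c))
			continue;
		if (i > 0 && (c == '.' || c == '_' || c == '-'))
			continue;
		return (0);
	}
	return (1);
}
```

The cast to unsigned char keeps the ctype call well defined for bytes outside the ASCII range.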


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@
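
A minimal sketch of how a /31 pair can be derived, assuming a layout that this message does not spell out: consecutive /31 subnets are carved out of a prefix, the even address goes to the host's tap(4) side and the odd one to the vm. The function name, the indexing scheme, and the host/guest assignment are all hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical layout: carve consecutive /31 subnets out of `prefix`
 * (host byte order).  Each /31 holds exactly two addresses; here the
 * even one goes to the host's tap(4) side and the odd one to the vm.
 */
void
ex_local_addrs(uint32_t prefix, uint32_t idx, uint32_t *host,
    uint32_t *guest)
{
	uint32_t base = prefix + (idx << 1);

	*host = base & ~(uint32_t)1;
	*guest = base | 1;
}
```

For a prefix of 100.64.0.0 (0x64400000) and idx 1, this yields 100.64.0.2 for the host and 100.64.0.3 for the vm.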


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings, which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled-by-default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Further network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking for pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.143 24-Apr-2023 kn

Missing the optional default config is not an error

/var/log/{messages,daemon} logs ENOENT as error on default configless vmd.
Only complain on explicitly passed files and print a debug hint under `-vv'
in case someone forgot to populate their /etc/vm.conf.

OK dv mlarkin


# 1.142 23-Apr-2023 dv

vmd(8): teach vmm process how to exec.

Use execvp(2) to launch vm children with new address spaces.
Consequently, introduces use of unveil(2) into the vmm and vm
processes.

This imposes the requirement of launching vmd with absolute paths,
similar to sshd(8).

ok mlarkin@


# 1.141 19-Apr-2023 jsg

remove duplicate includes


# 1.140 16-Apr-2023 dv

vmd(8): clean up fd closing in vmm process.

Some mild tidying of fd closing in the vmm process in prep for
landing parts of my fork+exec diff.

With input from guenther@ on the nuances of if/when EINTR may happen
in a call to close(2).

ok mlarkin@


# 1.139 02-Apr-2023 dv

vmd(8): migrate vmd_vm.vm_ttyname to char array.

Other structs use a fixed length array already. This allows a vmd_vm
object to be transmitted over an ipc channel, too.

Additionally, solves a segfault caused by a strlcpy(3) in an error
path.

ok mlarkin@


Revision tags: OPENBSD_7_3_BASE
# 1.138 28-Jan-2023 dv

Move some header definitions from vmm(4) to vmd(8).

Part of an ongoing effort to move userland-specific information out
of a kernel header and directly into vmd(8). No functional change.

ok mlarkin@


# 1.137 22-Jan-2023 dv

vmd(8): don't remove known vm's from the config on error.

Multiple error paths, specifically the one related to if a guest
cannot allocate memory at start, resulted in a known vm (via
vm.conf(5)) being removed from the vm list. Adjust the error paths
to check if the failing vm is defined in the config before tearing
it down.

Tested with help from beck@ and Mischa Peters.

ok beck@


# 1.136 14-Jan-2023 dv

Only open /dev/vmm once in vmd(8).

Have the parent process open /dev/vmm and send the fd to the vmm
child process. Only the vmm process and its resulting children
(guest vms) need it for ioctl calls.

ok kn@


# 1.135 28-Dec-2022 jmc

spelling fixes; from paul tagliamonte
any parts of his diff not taken are noted on tech


# 1.134 15-Dec-2022 dv

Add explicit casts to ctype functions in vmd(8).

OK millert@


# 1.133 31-Oct-2022 dv

vmd(8): remove unfinished user accounting.

User accounting and enforcement was never finished. tedu the thing
until someone wants to pick it up and finish it.

Originally found by Matthew Martin.

ok mlarkin@, kn@. input from tb@.


Revision tags: OPENBSD_7_2_BASE
# 1.132 13-Sep-2022 martijn

branches: 1.132.2;
Add (partial) support for agentx in vmd.

Metrics can be found under mib-2.236 and VM-MIB (RFC7666).

Stress tested by and happy noises from Mischa Peters
OK dv@


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

branches: 1.130.2;
vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses the 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
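
A minimal sketch of the convention described above, assuming a hypothetical
helper named open_readonly() that is not part of vmd. It compares the
result of open(2) against -1 exactly and relies on errno only on that path.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from vmd): open a file read-only.
 * Returns the descriptor, or -1 with errno left as set by open(2).
 */
int
open_readonly(const char *path)
{
	int fd;

	/* Test for -1 exactly, rather than for any value below zero. */
	if ((fd = open(path, O_RDONLY)) == -1)
		return (-1);	/* errno is only meaningful on this path */
	return (fd);
}
```

A caller would check the return value the same way, for example
`if ((fd = open_readonly(path)) == -1) warn("%s", path);`.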


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
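
As an illustration of tracking state in a single bitfield, here is a small
sketch. The EX_STATE_* names, their values, and struct ex_vm are
hypothetical and chosen for this example; they are not taken from vmd.

```c
#include <stdint.h>

/* Hypothetical flag names and values, for illustration only. */
#define EX_STATE_RUNNING	0x01
#define EX_STATE_DISABLED	0x02
#define EX_STATE_PAUSED		0x04

struct ex_vm {
	uint32_t	state;	/* bitwise OR of EX_STATE_* flags */
};

/* Set or clear the paused bit without touching the other flags. */
void
ex_pause(struct ex_vm *vm)
{
	vm->state |= EX_STATE_PAUSED;
}

void
ex_unpause(struct ex_vm *vm)
{
	vm->state &= ~(uint32_t)EX_STATE_PAUSED;
}

/* Map the flags to a single label; paused takes precedence over running. */
const char *
ex_state_name(const struct ex_vm *vm)
{
	if (vm->state & EX_STATE_PAUSED)
		return ("paused");
	if (vm->state & EX_STATE_RUNNING)
		return ("running");
	if (vm->state & EX_STATE_DISABLED)
		return ("disabled");
	return ("stopped");
}
```

With one field, reporting and checking a state becomes a mask test instead
of a comparison across several independent variables.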


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
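
The rate limit described above can be sketched roughly as follows. The two
constants use the values quoted in the message; struct ex_vm_rate and
ex_vm_note_start() are hypothetical names, and resetting the counter after
a slow restart is an assumption of this sketch rather than something the
message states.

```c
#include <time.h>

#define VM_START_RATE_SEC	6	/* value quoted in the commit message */
#define VM_START_RATE_LIMIT	3	/* value quoted in the commit message */

/* Hypothetical per-VM bookkeeping for this example. */
struct ex_vm_rate {
	time_t	last_start;	/* 0 until the VM has been started once */
	int	fast_count;	/* consecutive fast restarts seen so far */
};

/*
 * Record a (re)start at time "now".
 * Returns 0 if the VM may start, -1 if it should be stopped.
 */
int
ex_vm_note_start(struct ex_vm_rate *r, time_t now)
{
	if (r->last_start != 0 && now - r->last_start < VM_START_RATE_SEC)
		r->fast_count++;
	else
		r->fast_count = 0;	/* assumption: slow restart resets */
	r->last_start = now;

	return (r->fast_count >= VM_START_RATE_LIMIT ? -1 : 0);
}
```

Under these assumptions a VM that restarts once per second is refused on
the third fast restart, while a VM that reboots every few minutes is never
affected.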


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-'. but does not start with the latter
three.

ok mlarkin@ pd@
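
A short sketch of the naming rule stated above, assuming a hypothetical
function ex_vm_name_valid() that is not the one in vmd. Rejecting NULL and
empty names is an extra assumption of this example.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical validator for the rule in the commit message:
 * alphanumerics plus '.', '_' and '-', where the first character
 * must be alphanumeric. Returns 1 if valid, 0 otherwise.
 */
int
ex_vm_name_valid(const char *name)
{
	size_t i;

	/* Assumption: NULL and empty names are treated as invalid. */
	if (name == NULL || name[0] == '\0')
		return (0);

	/* The name may not start with '.', '_' or '-'. */
	if (!isalnum((unsigned char)name[0]))
		return (0);

	for (i = 1; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return (0);
	}
	return (1);
}
```

The casts to unsigned char keep the ctype calls well defined for bytes
outside the ASCII range.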


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking for pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.141 19-Apr-2023 jsg

remove duplicate includes


# 1.140 16-Apr-2023 dv

vmd(8): clean up fd closing in vmm process.

Some mild tidying of fd closing in the vmm process in prep for
landing parts of my fork+exec diff.

With input from guenther@ on the nuances of if/when EINTR may happen
in a call to close(2).

ok mlarkin@


# 1.139 02-Apr-2023 dv

vmd(8): migrate vmd_vm.vm_ttyname to char array.

Other structs use a fixed length array already. This allows a vmd_vm
object to be transmitted over an ipc channel, too.

Additionally, solves a segfault caused by a strlcpy(3) in an error
path.

ok mlarkin@


Revision tags: OPENBSD_7_3_BASE
# 1.138 28-Jan-2023 dv

Move some header definitions from vmm(4) to vmd(8).

Part of an ongoing effort to move userland-specific information out
of a kernel header and directly into vmd(8). No functional change.

ok mlarkin@


# 1.137 22-Jan-2023 dv

vmd(8): don't remove known vm's from the config on error.

Multiple error paths, specifically the one related to if a guest
cannot allocate memory at start, resulted in a known vm (via
vm.conf(5)) being removed from the vm list. Adjust the error paths
to check if the failing vm is defined in the config before tearing
it down.

Tested with help from beck@ and Mischa Peters.

ok beck@


# 1.136 14-Jan-2023 dv

Only open /dev/vmm once in vmd(8).

Have the parent process open /dev/vmm and send the fd to the vmm
child process. Only the vmm process and its resulting children
(guest vms) need it for ioctl calls.

ok kn@


# 1.135 28-Dec-2022 jmc

spelling fixes; from paul tagliamonte
any parts of his diff not taken are noted on tech


# 1.134 15-Dec-2022 dv

Add explicit casts to ctype functions in vmd(8).

OK millert@


# 1.133 31-Oct-2022 dv

vmd(8): remove unfinished user accounting.

User accounting and enforcement was never finished. tedu the thing
until someone wants to pick it up and finish it.

Originally found by Matthew Martin.

ok mlarkin@, kn@. input from tb@.


Revision tags: OPENBSD_7_2_BASE
# 1.132 13-Sep-2022 martijn

branches: 1.132.2;
Add (partial) support for agentx in vmd.

Metrics can be found under mib-2.236 and VM-MIB (RFC7666).

Stress tested by and happy noises from Mischa Peters
OK dv@


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

branches: 1.130.2;
vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses the 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
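
A minimal sketch of the batching arithmetic described above, not vmd's actual code. The function name and parameters are hypothetical; only the 30 second default comes from the commit message.

```c
/* Default delay between batches, as stated in the commit message. */
#define STAGGER_DELAY_SEC	30

/*
 * Hypothetical helper: seconds to wait before starting the vm at
 * position `index` (0-based) in vm.conf, when `parallelism` vms are
 * started per batch. A parallelism of 0 is treated as 1.
 */
unsigned int
stagger_delay(unsigned int index, unsigned int parallelism,
    unsigned int delay)
{
	if (parallelism == 0)
		parallelism = 1;
	return (index / parallelism) * delay;
}
```

With a parallelism of 4, the first four vms start immediately and the next four start one delay interval later.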


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
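
As an illustration of the convention described above (a sketch with a hypothetical helper name, not code from this commit), a caller compares the result against -1 and only then reads errno:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 0 if `path` can be opened read-only,
 * otherwise the errno value reported by open(2).
 */
int
try_open(const char *path)
{
	int fd;

	/* Compare against -1 exactly, not "< 0". */
	if ((fd = open(path, O_RDONLY)) == -1)
		return errno;	/* errno is only meaningful here */
	close(fd);
	return 0;
}
```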


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
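
A rough sketch of tracking state in a single bitfield. The flag names and values below are assumptions chosen for illustration and may not match the definitions in vmd.h.

```c
#include <stdbool.h>

/* Hypothetical flag values, one bit per state. */
#define VM_STATE_RUNNING	0x01
#define VM_STATE_PAUSED		0x02
#define VM_STATE_DISABLED	0x04

/* Set one or more state bits. */
void
vm_state_set(unsigned int *state, unsigned int flag)
{
	*state |= flag;
}

/* Clear one or more state bits. */
void
vm_state_clear(unsigned int *state, unsigned int flag)
{
	*state &= ~flag;
}

/* Test whether any of the given bits are set. */
bool
vm_state_is(unsigned int state, unsigned int flag)
{
	return (state & flag) != 0;
}
```

A paused vm is then simply one with both the running and paused bits set, which a set of separate integer variables could only express by keeping them in sync by hand.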


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
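
The rate limit above can be sketched roughly as follows. The two #defines and their values come from the commit message; the struct, the function name, and the choice to reset the counter after a slow restart are assumptions for illustration only.

```c
#include <stdbool.h>
#include <time.h>

#define VM_START_RATE_SEC	6	/* restarts faster than this are "fast" */
#define VM_START_RATE_LIMIT	3	/* fast restarts tolerated before stopping */

/* Hypothetical per-VM bookkeeping; not vmd's actual struct layout. */
struct reboot_tracker {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_count;	/* fast restarts counted so far */
};

/*
 * Record a (re)start at time `now`. Returns true if the VM may start,
 * false once VM_START_RATE_LIMIT fast restarts have been counted.
 */
bool
reboot_allowed(struct reboot_tracker *rt, time_t now)
{
	if (rt->last_start != 0 &&
	    now - rt->last_start < VM_START_RATE_SEC)
		rt->fast_count++;
	else
		rt->fast_count = 0;	/* assumption: slow restart resets */
	rt->last_start = now;

	return rt->fast_count < VM_START_RATE_LIMIT;
}
```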


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@
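
The lookup order described above (derived image first, then the base image) can be illustrated with a toy in-memory model. This is not the qcow2 on-disk format or vmd's implementation; the struct and function are hypothetical.

```c
#include <stddef.h>

#define NCLUSTERS	4

/* Toy image: a NULL cluster means "not allocated in this image". */
struct image {
	const struct image	*base;			/* NULL if no base image */
	const char		*cluster[NCLUSTERS];
};

/*
 * Walk from the derived image towards its base and return the first
 * image's data for cluster `idx`, or NULL if no image in the chain
 * has it.
 */
const char *
lookup_cluster(const struct image *img, size_t idx)
{
	if (idx >= NCLUSTERS)
		return NULL;
	for (; img != NULL; img = img->base) {
		if (img->cluster[idx] != NULL)
			return img->cluster[idx];
	}
	return NULL;
}
```

The model also shows why modifying the base corrupts derived images: any cluster the derived image has not overridden is read straight from the base.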


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
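
A small sketch of the pattern this fix relies on, with a hypothetical struct rather than vmd's real one: zero-initialized fd fields look like a valid descriptor 0, so every field is set to -1 up front and close(2) is skipped for -1.

```c
#include <unistd.h>

#define NDISKS	4

/* Hypothetical set of per-VM descriptors. */
struct vm_fds {
	int	disk[NDISKS];
	int	kernel;
};

/* Mark every descriptor as "not open". */
void
vm_fds_init(struct vm_fds *f)
{
	int i;

	for (i = 0; i < NDISKS; i++)
		f->disk[i] = -1;
	f->kernel = -1;
}

/* Close only descriptors that were opened; returns how many were closed. */
int
vm_fds_close(struct vm_fds *f)
{
	int i, n = 0;

	for (i = 0; i < NDISKS; i++) {
		if (f->disk[i] != -1) {
			close(f->disk[i]);
			f->disk[i] = -1;
			n++;
		}
	}
	if (f->kernel != -1) {
		close(f->kernel);
		f->kernel = -1;
		n++;
	}
	return n;
}
```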


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@
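
The idea of checking after opening can be sketched as below. The helper name and the specific policy (regular file owned by the given uid) are assumptions for illustration; vmd's real permission check is more involved.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` first, then inspect the opened
 * descriptor with fstat(2). Because the check is made on the fd and
 * not on the path, the file cannot be swapped between check and use.
 * Returns the fd, or -1 if opening or the check fails.
 */
int
open_checked(const char *path, uid_t uid)
{
	struct stat st;
	int fd;

	if ((fd = open(path, O_RDONLY | O_NONBLOCK)) == -1)
		return -1;
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode) ||
	    st.st_uid != uid) {
		close(fd);
		return -1;
	}
	return fd;
}
```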


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (don't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-'. but does not start with the latter
three.

ok mlarkin@ pd@
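
The naming rule above could be checked along these lines. This is a sketch with a hypothetical function name, not the validation code from this commit; rejecting empty names is an added assumption.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical check: a name may contain alphanumerics plus '.', '_'
 * and '-', but must not start with one of those three characters.
 * Empty or NULL names are rejected (assumption).
 */
bool
vm_name_ok(const char *name)
{
	const unsigned char *p = (const unsigned char *)name;

	if (name == NULL || *p == '\0')
		return false;
	if (*p == '.' || *p == '_' || *p == '-')
		return false;
	for (; *p != '\0'; p++) {
		/* unsigned char keeps isalnum() well-defined */
		if (!isalnum(*p) && *p != '.' && *p != '_' && *p != '-')
			return false;
	}
	return true;
}
```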


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". For network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.138 28-Jan-2023 dv

Move some header definitions from vmm(4) to vmd(8).

Part of an ongoing effort to move userland-specific information out
of a kernel header and directly into vmd(8). No functional change.

ok mlarkin@


# 1.137 22-Jan-2023 dv

vmd(8): don't remove known vm's from the config on error.

Multiple error paths, specifically the one related to if a guest
cannot allocate memory at start, resulted in a known vm (via
vm.conf(5)) being removed from the vm list. Adjust the error paths
to check if the failing vm is defined in the config before tearing
it down.

Tested with help from beck@ and Mischa Peters.

ok beck@


# 1.136 14-Jan-2023 dv

Only open /dev/vmm once in vmd(8).

Have the parent process open /dev/vmm and send the fd to the vmm
child process. Only the vmm process and its resulting children
(guest vms) need it for ioctl calls.

ok kn@


# 1.135 28-Dec-2022 jmc

spelling fixes; from paul tagliamonte
any parts of his diff not taken are noted on tech


# 1.134 15-Dec-2022 dv

Add explicit casts to ctype functions in vmd(8).

OK millert@


# 1.133 31-Oct-2022 dv

vmd(8): remove unfinished user accounting.

User accounting and enforcement was never finished. tedu the thing
until someone wants to pick it up and finish it.

Originally found by Matthew Martin.

ok mlarkin@, kn@. input from tb@.


Revision tags: OPENBSD_7_2_BASE
# 1.132 13-Sep-2022 martijn

branches: 1.132.2;
Add (partial) support for agentx in vmd.

Metrics can be found under mib-2.236 and VM-MIB (RFC7666).

Stress tested by and happy noises from Mischa Peters
OK dv@


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

branches: 1.130.2;
vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses the 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
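
The batching described in this entry can be sketched as follows. This is a minimal illustration, not the code from the commit: start_delay() and its parameters are hypothetical names, and the only values taken from the log message are the ideas of a per-batch parallelism and a fixed delay between batches.

```c
#include <assert.h>

/*
 * Illustrative only: start_delay() is a hypothetical helper, not a
 * function from vmd. It maps the position of a vm in the configured
 * list to the number of seconds to wait before starting it.
 */
int
start_delay(int index, int parallel, int delay_sec)
{
	assert(index >= 0 && delay_sec >= 0);
	if (parallel < 1)
		parallel = 1;	/* assumption: at least one vm per batch */
	/* vms in the same batch share a delay; each batch adds delay_sec */
	return (index / parallel) * delay_sec;
}
```

With a parallelism of 4 and a delay of 30 seconds, the first four vms would start immediately and the fifth would wait 30 seconds.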


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
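
A small sketch of the convention this entry describes, using open(2) as the example system call. try_open() is a hypothetical helper written for this note and does not appear in vmd.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* hypothetical helper: returns 0 if path can be opened, else errno */
int
try_open(const char *path)
{
	int fd;

	assert(path != NULL);
	/* compare against -1 exactly; errno is only meaningful in that case */
	if ((fd = open(path, O_RDONLY)) == -1)
		return errno;
	close(fd);
	return 0;
}
```

Testing for `== -1` rather than `< 0` mirrors the documented contract of the call, which is the strictness the entry asks for.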


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
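
As an illustration of the single-bitfield approach, the sketch below keeps all state in one unsigned int. The flag names and values are hypothetical and chosen for this example; they are not claimed to match the definitions in vmd.

```c
#include <assert.h>

/* hypothetical flag names and values, chosen for this example */
#define ST_RUNNING	0x01
#define ST_DISABLED	0x02
#define ST_PAUSED	0x04

/* pausing only makes sense for a running vm */
unsigned int
state_pause(unsigned int state)
{
	assert(state & ST_RUNNING);
	return state | ST_PAUSED;
}

/* report the most specific state that is set */
const char *
state_name(unsigned int state)
{
	if (state & ST_PAUSED)
		return "paused";
	if (state & ST_RUNNING)
		return "running";
	if (state & ST_DISABLED)
		return "disabled";
	return "stopped";
}
```

Because every condition lives in one word, a status report needs a single function like state_name() instead of checks spread over several variables.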


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
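
The rate limit can be sketched like this. The two constants and their values come from the log message; the struct, the function name, and the decision to reset the counter after a slow reboot are assumptions made for the example.

```c
#include <assert.h>
#include <time.h>

#define VM_START_RATE_SEC	6	/* value quoted in the log message */
#define VM_START_RATE_LIMIT	3	/* value quoted in the log message */

struct reboot_rate {			/* hypothetical type */
	time_t	last_start;		/* previous start, 0 if none */
	int	fast_count;		/* consecutive fast reboots */
};

/* returns 1 if the vm may start again, 0 if it should be stopped */
int
reboot_rate_check(struct reboot_rate *r, time_t now)
{
	assert(r != NULL);
	if (r->last_start != 0 && now - r->last_start < VM_START_RATE_SEC)
		r->fast_count++;
	else
		r->fast_count = 0;	/* assumption: slow reboot resets */
	r->last_start = now;
	return r->fast_count < VM_START_RATE_LIMIT;
}
```

A vm that restarts once per second is allowed two fast reboots and is stopped on the third; a restart after a long pause starts counting from zero again under the stated assumption.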


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@
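
The lookup order described in this entry can be illustrated with a toy model. Nothing here reflects the qcow2 on-disk format or the vmd implementation: struct image, NCLUSTERS, and image_lookup() are hypothetical, and strings stand in for cluster data.

```c
#include <assert.h>
#include <stddef.h>

#define NCLUSTERS	4	/* tiny image size, for illustration */

struct image {				/* hypothetical type */
	const struct image	*base;	/* NULL for a base image */
	const char		*data[NCLUSTERS]; /* NULL: not held here */
};

const char *
image_lookup(const struct image *img, size_t idx)
{
	assert(idx < NCLUSTERS);
	/* start in the derived image, then walk towards the base */
	for (; img != NULL; img = img->base)
		if (img->data[idx] != NULL)
			return img->data[idx];
	return NULL;	/* no image in the chain holds this data */
}
```

The model also shows why modifying the base corrupts a derived image: any cluster the derived image does not hold is read straight from the base.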


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
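
A minimal sketch of the pattern behind this fix, with hypothetical helper names. A zeroed fd field looks like a valid descriptor (0 is stdin), so unused slots are marked with -1 and skipped on close.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* mark every slot as unused */
void
fds_init(int *fds, size_t n)
{
	size_t i;

	assert(fds != NULL);
	for (i = 0; i < n; i++)
		fds[i] = -1;	/* 0 would name a valid descriptor */
}

/* close all used slots and return how many were closed */
int
fds_close(int *fds, size_t n)
{
	size_t i;
	int closed = 0;

	assert(fds != NULL);
	for (i = 0; i < n; i++) {
		if (fds[i] == -1)
			continue;	/* never pass -1 to close(2) */
		close(fds[i]);
		fds[i] = -1;
		closed++;
	}
	return closed;
}
```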


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU attacks for instances.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@
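
The rule in this entry can be sketched as a small predicate. vm_name_valid() is a hypothetical name for this example, and rejecting the empty string is an assumption; the character set and the restriction on the first character follow the log message.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* returns 1 if name follows the rule in the log message, else 0 */
int
vm_name_valid(const char *name)
{
	size_t i;

	assert(name != NULL);
	if (name[0] == '\0')
		return 0;	/* assumption: empty names are rejected */
	if (name[0] == '.' || name[0] == '_' || name[0] == '-')
		return 0;
	for (i = 0; name[i] != '\0'; i++) {
		/* ctype functions need an unsigned char value */
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return 0;
	}
	return 1;
}
```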


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@
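
The /31 pairing can be illustrated with plain address arithmetic. Both helpers are hypothetical and work on host byte order IPv4 values; the sketch does not show how vmd picks the subnet for a given VM, and it does not claim which of the two addresses goes to the host side.

```c
#include <assert.h>
#include <stdint.h>

/* build a host byte order IPv4 address from dotted-quad parts */
uint32_t
ipv4(unsigned int a, unsigned int b, unsigned int c, unsigned int d)
{
	assert(a <= 255 && b <= 255 && c <= 255 && d <= 255);
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	    ((uint32_t)c << 8) | (uint32_t)d;
}

/* a /31 holds exactly two addresses that differ in the last bit */
uint32_t
peer_in_slash31(uint32_t addr)
{
	return addr ^ 1;
}
```

For example, if one end of the link were 100.64.1.2, the other end in the same /31 would be 100.64.1.3.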


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". For network
configuration, this will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root, but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after the
last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking for pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.137 22-Jan-2023 dv

vmd(8): don't remove known vm's from the config on error.

Multiple error paths, specifically the one related to if a guest
cannot allocate memory at start, resulted in a known vm (via
vm.conf(5)) being removed from the vm list. Adjust the error paths
to check if the failing vm is defined in the config before tearing
it down.

Tested with help from beck@ and Mischa Peters.

ok beck@


# 1.136 14-Jan-2023 dv

Only open /dev/vmm once in vmd(8).

Have the parent process open /dev/vmm and send the fd to the vmm
child process. Only the vmm process and its resulting children
(guest vms) need it for ioctl calls.

ok kn@


# 1.135 28-Dec-2022 jmc

spelling fixes; from paul tagliamonte
any parts of his diff not taken are noted on tech


# 1.134 15-Dec-2022 dv

Add explicit casts to ctype functions in vmd(8).

OK millert@


# 1.133 31-Oct-2022 dv

vmd(8): remove unfinished user accounting.

User accounting and enforcement was never finished. tedu the thing
until someone wants to pick it up and finish it.

Originally found by Matthew Martin.

ok mlarkin@, kn@. input from tb@.


Revision tags: OPENBSD_7_2_BASE
# 1.132 13-Sep-2022 martijn

branches: 1.132.2;
Add (partial) support for agentx in vmd.

Metrics can be found under mib-2.236 and VM-MIB (RFC7666).

Stress tested by and happy noises from Mischa Peters
OK dv@


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

branches: 1.130.2;
vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses the 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
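
The batching described in revision 1.117 can be sketched roughly as follows. This
is only an illustration, not the code from the commit: the function name
start_delay() and the constant START_DELAY_SEC are hypothetical, and the sketch
assumes that vms are started in fixed-size batches with one batch per delay
interval.

```c
#include <stddef.h>

/* Hypothetical constant; the log only states a default of 30 seconds. */
#define START_DELAY_SEC	30

/*
 * Return how many seconds to wait before starting the vm at position
 * idx (0-based) when vms are started in batches of `parallel`, with
 * one batch every START_DELAY_SEC seconds.
 */
unsigned int
start_delay(size_t idx, size_t parallel)
{
	if (parallel == 0)
		parallel = 1;	/* avoid division by zero */
	return (unsigned int)(idx / parallel) * START_DELAY_SEC;
}
```

With a parallelism of 4, the first four vms would start immediately and the
fifth after 30 seconds.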


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
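
The convention from revision 1.114 can be illustrated with a minimal sketch. The
helper name open_readonly() is hypothetical and not taken from vmd; only
open(2) and perror(3) are real interfaces here.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Open path read-only. Returns the fd on success, or -1 on failure
 * with errno left as set by open(2).
 */
int
open_readonly(const char *path)
{
	int fd;

	/* Compare against -1 exactly, not "< 0". */
	if ((fd = open(path, O_RDONLY)) == -1) {
		perror(path);
		return -1;
	}
	return fd;
}
```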


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
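
A single-bitfield state as described in revision 1.110 might look like the
following sketch. All names (EX_STATE_*, struct ex_vm, ex_pause() and so on)
are hypothetical and chosen for illustration; they are not the identifiers
used in vmd.

```c
/* Hypothetical state flags, one bit each. */
#define EX_STATE_RUNNING	0x01
#define EX_STATE_PAUSED		0x02
#define EX_STATE_DISABLED	0x04

struct ex_vm {
	unsigned int	state;	/* OR of EX_STATE_* flags */
};

/* Mark the vm as paused without touching the other flags. */
void
ex_pause(struct ex_vm *vm)
{
	vm->state |= EX_STATE_PAUSED;
}

/* Clear the paused flag without touching the other flags. */
void
ex_unpause(struct ex_vm *vm)
{
	vm->state &= ~EX_STATE_PAUSED;
}

/* Return 1 if all bits in flag are set, 0 otherwise. */
int
ex_has_state(const struct ex_vm *vm, unsigned int flag)
{
	return (vm->state & flag) == flag;
}
```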


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
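
The rate limit from revision 1.104 can be sketched as below. The two #defines
and their values come from the log message; the struct, the function name
ex_check_start() and the choice to reset the counter after a slow restart are
assumptions made for this illustration, not the committed code.

```c
#include <time.h>

#define VM_START_RATE_SEC	6	/* restarts faster than this count */
#define VM_START_RATE_LIMIT	3	/* fast restarts before stopping */

/* Hypothetical per-vm bookkeeping. */
struct ex_ratelimit {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_count;	/* consecutive fast restarts seen so far */
};

/*
 * Record a start at time `now`. Returns 0 if the vm may start, or -1
 * if it restarted too quickly VM_START_RATE_LIMIT times in a row.
 */
int
ex_check_start(struct ex_ratelimit *rl, time_t now)
{
	if (rl->last_start != 0 &&
	    now - rl->last_start < VM_START_RATE_SEC)
		rl->fast_count++;
	else
		rl->fast_count = 0;	/* assumption: slow restart resets */
	rl->last_start = now;

	return (rl->fast_count >= VM_START_RATE_LIMIT) ? -1 : 0;
}
```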


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
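
The pattern behind revision 1.100 can be sketched as follows. The struct and
function names are hypothetical; the point is only that an fd field left at 0
refers to stdin, so unused slots are set to -1 and close(2) is skipped for
them.

```c
#include <unistd.h>

#define EX_NFDS	4	/* hypothetical number of fd slots per vm */

struct ex_vm_fds {
	int	fds[EX_NFDS];
};

/* Mark every slot as unused so that 0 (stdin) is never assumed. */
void
ex_fds_init(struct ex_vm_fds *v)
{
	int i;

	for (i = 0; i < EX_NFDS; i++)
		v->fds[i] = -1;
}

/* Close every open slot and mark it unused; return how many were closed. */
int
ex_fds_close(struct ex_vm_fds *v)
{
	int i, n = 0;

	for (i = 0; i < EX_NFDS; i++) {
		if (v->fds[i] == -1)
			continue;	/* never call close(-1) */
		close(v->fds[i]);
		v->fds[i] = -1;
		n++;
	}
	return n;
}
```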


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@
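
The naming rule from revision 1.66 can be sketched as below. The function name
ex_valid_name() is hypothetical, and rejecting NULL and empty names is an
assumption of this sketch rather than something the log message states. The
casts to unsigned char follow the usual requirement for the ctype functions.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Return 1 if name consists only of alphanumeric characters, '.',
 * '_' and '-', and starts with an alphanumeric character; 0 otherwise.
 */
int
ex_valid_name(const char *name)
{
	size_t i;

	if (name == NULL || name[0] == '\0')
		return 0;	/* assumption: empty names are invalid */

	/* The first character must not be '.', '_' or '-'. */
	if (!isalnum((unsigned char)name[0]))
		return 0;

	for (i = 1; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return 0;
	}
	return 1;
}
```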


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of the VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after the
last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking for pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.136 14-Jan-2023 dv

Only open /dev/vmm once in vmd(8).

Have the parent process open /dev/vmm and send the fd to the vmm
child process. Only the vmm process and its resulting children
(guest vms) need it for ioctl calls.

ok kn@


# 1.135 28-Dec-2022 jmc

spelling fixes; from paul tagliamonte
any parts of his diff not taken are noted on tech


# 1.134 15-Dec-2022 dv

Add explicit casts to ctype functions in vmd(8).

OK millert@


# 1.133 31-Oct-2022 dv

vmd(8): remove unfinished user accounting.

User accounting and enforcement was never finished. tedu the thing
until someone wants to pick it up and finish it.

Originally found by Matthew Martin.

ok mlarkin@, kn@. input from tb@.


Revision tags: OPENBSD_7_2_BASE
# 1.132 13-Sep-2022 martijn

Add (partial) support for agentx in vmd.

Metrics can be found under mib-2.236 and VM-MIB (RFC7666).

Stress tested by and happy noises from Mischa Peters
OK dv@


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses the 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
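
The batching described in revision 1.117 can be illustrated with a small calculation. This is a simplified sketch, not vmd's actual scheduling code (which is event driven); the function name and the idea of computing a fixed offset per VM are assumptions made for illustration.

```c
/*
 * Returns the delay in seconds before the VM at position `index`
 * (0-based) is started, given `parallel` VMs per batch and `delay`
 * seconds between batches. Returns -1 on invalid arguments.
 */
long
stagger_start_delay(int index, int parallel, int delay)
{
	if (index < 0 || parallel <= 0 || delay < 0)
		return -1;
	/* Integer division groups VMs into batches of `parallel`. */
	return (long)(index / parallel) * delay;
}
```

With a parallelism of 4 and the default 30 second delay, VMs 0 to 3 start immediately, VMs 4 to 7 after 30 seconds, and so on.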


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
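
The convention from revision 1.114 can be sketched as follows. The helper name is hypothetical; open(2), close(2) and errno are the standard interfaces.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to open `path` read-only. Returns 0 on success, or the errno
 * value reported by open(2) on failure.
 */
int
example_try_open(const char *path)
{
	int fd;

	/* Compare against -1 exactly, not "< 0". */
	if ((fd = open(path, O_RDONLY)) == -1)
		return errno;	/* errno is only meaningful on failure */
	close(fd);
	return 0;
}
```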


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
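
The single-bitfield approach from revision 1.110 can be sketched like this. The flag names and values below are hypothetical and chosen for illustration; the real definitions in vmd may differ.

```c
/* Hypothetical flag values; the real names in vmd may differ. */
#define EX_STATE_RUNNING	0x01
#define EX_STATE_DISABLED	0x02
#define EX_STATE_PAUSED		0x04

struct example_vm {
	unsigned int state;	/* OR of EX_STATE_* flags */
};

/* Set the paused bit without touching the other state bits. */
void
example_pause(struct example_vm *vm)
{
	vm->state |= EX_STATE_PAUSED;
}

/* Clear the paused bit without touching the other state bits. */
void
example_unpause(struct example_vm *vm)
{
	vm->state &= ~EX_STATE_PAUSED;
}

/* Map the flags to a short name; paused takes precedence. */
const char *
example_state_name(const struct example_vm *vm)
{
	if (vm->state & EX_STATE_PAUSED)
		return "paused";
	if (vm->state & EX_STATE_RUNNING)
		return "running";
	if (vm->state & EX_STATE_DISABLED)
		return "disabled";
	return "stopped";
}
```

Keeping all flags in one word means a paused VM is still marked running, and reporting code only has to look in one place.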


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
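
The rate limit described in revision 1.104 can be sketched as below. VM_START_RATE_SEC and VM_START_RATE_LIMIT and their values come from the log message; the struct, the function name, and resetting the counter after a slow restart are assumptions made for this illustration.

```c
#include <time.h>

#define VM_START_RATE_SEC	6	/* restarts faster than this count */
#define VM_START_RATE_LIMIT	3	/* fast restarts allowed before stop */

/* Hypothetical per-VM bookkeeping. */
struct example_limiter {
	time_t	last_start;	/* 0 means "never started" */
	int	fast_count;	/* consecutive fast restarts */
};

/*
 * Record a (re)start at time `now`. Returns 1 if the VM should be
 * stopped because it rebooted too quickly too often, 0 otherwise.
 */
int
example_limiter_check(struct example_limiter *rl, time_t now)
{
	if (rl->last_start != 0) {
		if (now - rl->last_start < VM_START_RATE_SEC)
			rl->fast_count++;
		else
			rl->fast_count = 0;	/* assumption: slow restart resets */
	}
	rl->last_start = now;
	return rl->fast_count >= VM_START_RATE_LIMIT;
}
```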


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
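
The fix in revision 1.100 follows a common pattern: mark unused descriptors with -1 and never pass -1 to close(2). A minimal sketch, assuming a hypothetical struct and helper names:

```c
#include <unistd.h>

#define EX_MAX_FDS 4	/* hypothetical size, for illustration only */

struct example_fds {
	int fd[EX_MAX_FDS];
};

/* Mark every slot as "not open"; a zeroed slot would alias stdin. */
void
example_fds_init(struct example_fds *f)
{
	int i;

	for (i = 0; i < EX_MAX_FDS; i++)
		f->fd[i] = -1;
}

/*
 * Close *fd if it is open and reset it to -1.
 * Returns 1 if a descriptor was closed, 0 if the slot was unused.
 */
int
example_fd_close(int *fd)
{
	if (*fd == -1)
		return 0;
	close(*fd);
	*fd = -1;
	return 1;
}
```

Resetting the slot after closing also makes a second call harmless.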


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU attacks for instances.

OK mlarkin@
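
Checking after open, as in revision 1.97, means inspecting the descriptor with fstat(2) rather than the path with stat(2). The following is a sketch only; the function name and the specific checks (regular file, matching owner) are assumptions and do not reproduce vmd's actual permission logic.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open `path` and verify, on the open descriptor, that it is a
 * regular file owned by `uid`. Returns the fd, or -1 on failure.
 */
int
example_open_checked(const char *path, uid_t uid)
{
	struct stat st;
	int fd;

	if ((fd = open(path, O_RDONLY | O_NONBLOCK)) == -1)
		return -1;
	/* fstat(2) inspects the object that was actually opened. */
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode) ||
	    st.st_uid != uid) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Because the check and the later use refer to the same descriptor, swapping the file at that path between the two steps has no effect.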


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@
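
The naming rule from revision 1.66 can be sketched as a small validator. The function name is hypothetical, and rejecting an empty name is an assumption not stated in the log message.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Returns 1 if `name` consists only of alphanumerics, '.', '_' and
 * '-', and starts with an alphanumeric character; 0 otherwise.
 */
int
example_valid_vm_name(const char *name)
{
	const char *p;

	/* Assumption: an empty or missing name is invalid. */
	if (name == NULL || *name == '\0')
		return 0;
	/* Must not start with '.', '_' or '-'. */
	if (!isalnum((unsigned char)name[0]))
		return 0;
	for (p = name; *p != '\0'; p++) {
		if (!isalnum((unsigned char)*p) &&
		    *p != '.' && *p != '_' && *p != '-')
			return 0;
	}
	return 1;
}
```

The casts to unsigned char keep the ctype calls well defined for bytes above 0x7f.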


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of the VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". For network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root, but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after the
last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking for pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.135 28-Dec-2022 jmc

spelling fixes; from paul tagliamonte
any parts of his diff not taken are noted on tech


# 1.134 15-Dec-2022 dv

Add explicit casts to ctype functions in vmd(8).

OK millert@


# 1.133 31-Oct-2022 dv

vmd(8): remove unfinished user accounting.

User accounting and enforcement was never finished. tedu the thing
until someone wants to pick it up and finish it.

Originally found by Matthew Martin.

ok mlarkin@, kn@. input from tb@.


Revision tags: OPENBSD_7_2_BASE
# 1.132 13-Sep-2022 martijn

Add (partial) support for agentx in vmd.

Metrics can be found under mib-2.236 and VM-MIB (RFC7666).

Stress tested by and happy noises from Mischa Peters
OK dv@


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
ignored did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666,but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses the 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
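
As a rough illustration of the batching described in this entry, the delay for each vm could be derived from its position in the start order. The function name and the 0-based index are assumptions made for this sketch, not vmd(8)'s actual implementation.

```c
#include <stddef.h>

/*
 * Seconds to wait before starting the vm at position `index` (0-based)
 * when `parallel` vms may boot at once and batches are `delay` seconds
 * apart. Hypothetical helper for illustration only.
 */
unsigned int
ex_stagger_delay(size_t index, size_t parallel, unsigned int delay)
{
	if (parallel == 0)
		parallel = 1;	/* avoid dividing by zero */
	return (unsigned int)(index / parallel) * delay;
}
```

With a parallelism of 4 and the default 30 second delay, vms 0 to 3 start immediately, vms 4 to 7 after 30 seconds, and so on.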


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
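
A minimal sketch of the convention this entry describes, using open(2) as the example call. The wrapper name ex_open_readonly is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open a file read-only. Returns 0 and stores the descriptor in *fdp on
 * success, or the errno value reported by open(2) on failure.
 */
int
ex_open_readonly(const char *path, int *fdp)
{
	int fd;

	/* Compare against -1 exactly; errno is only meaningful in that case. */
	if ((fd = open(path, O_RDONLY)) == -1)
		return errno;
	*fdp = fd;
	return 0;
}
```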


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this will make it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
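
The idea of a single state bitfield can be sketched as below. The flag names and values are placeholders chosen for this example and should not be read as the constants vmd(8) defines.

```c
/* Hypothetical state flags; one bit per condition. */
#define EX_STATE_RUNNING	0x01u
#define EX_STATE_DISABLED	0x02u
#define EX_STATE_PAUSED		0x04u

/* Map a state bitfield to the most specific human-readable name. */
const char *
ex_vm_state_name(unsigned int state)
{
	if (state & EX_STATE_PAUSED)
		return "paused";
	if (state & EX_STATE_RUNNING)
		return "running";
	if (state & EX_STATE_DISABLED)
		return "disabled";
	return "stopped";
}
```

Pausing then becomes `state |= EX_STATE_PAUSED` and unpausing `state &= ~EX_STATE_PAUSED`, with no separate variables to keep in sync.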


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@
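
One way to picture the list of known vms is a small table that hands back the same id whenever a name is seen again. This sketch assumes lookup by name only and a fixed-size table; both are simplifications for illustration, and every identifier in it is hypothetical.

```c
#include <string.h>

#define EX_MAX_KNOWN	8
#define EX_NAME_MAX	32

struct ex_known {
	char		name[EX_NAME_MAX];
	unsigned int	id;
};

struct ex_registry {
	struct ex_known	vms[EX_MAX_KNOWN];
	size_t		n;		/* entries in use */
	unsigned int	next_id;	/* last id handed out */
};

/* Return the id already assigned to name, or claim a new one (0 if full). */
unsigned int
ex_claim_id(struct ex_registry *r, const char *name)
{
	size_t i;

	for (i = 0; i < r->n; i++)
		if (strcmp(r->vms[i].name, name) == 0)
			return r->vms[i].id;
	if (r->n == EX_MAX_KNOWN)
		return 0;
	strncpy(r->vms[r->n].name, name, EX_NAME_MAX - 1);
	r->vms[r->n].name[EX_NAME_MAX - 1] = '\0';
	r->vms[r->n].id = ++r->next_id;
	return r->vms[r->n++].id;
}
```

Because a returning vm keeps its id, any address derived from that id stays the same too.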


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
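
The rate limit described in this entry could look roughly like the sketch below. The two constants come from the commit message; the structure, the function name, and the choice to reset the counter after a slow restart are assumptions of this example.

```c
#include <time.h>

#define VM_START_RATE_SEC	6	/* value quoted in the commit message */
#define VM_START_RATE_LIMIT	3	/* value quoted in the commit message */

/* Hypothetical per-vm bookkeeping. */
struct ex_vm_rate {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_reboots;	/* consecutive fast restarts seen so far */
};

/*
 * Record a (re)start at time `now`. Returns 1 if the vm may start and
 * 0 if it rebooted too quickly too often and should be stopped.
 */
int
ex_vm_check_start_rate(struct ex_vm_rate *r, time_t now)
{
	if (r->last_start != 0 && now - r->last_start < VM_START_RATE_SEC)
		r->fast_reboots++;
	else
		r->fast_reboots = 0;	/* assumption: a slow restart resets */
	r->last_start = now;
	return r->fast_reboots < VM_START_RATE_LIMIT;
}
```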


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@
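
The lookup order for derived images described in this entry can be sketched with a simplified in-memory model. This is not the qcow2 on-disk layout: the ex_image structure and its fixed cluster table are hypothetical stand-ins used only to show the fall-through from derived image to base image.

```c
#include <stddef.h>

#define EX_NCLUSTERS	4

/* Hypothetical stand-in for a disk image. */
struct ex_image {
	const struct ex_image	*base;			/* NULL for a base image */
	const char		*clusters[EX_NCLUSTERS];	/* NULL = not allocated */
};

/* Walk from the derived image toward the base until the cluster is found. */
const char *
ex_image_lookup(const struct ex_image *img, size_t idx)
{
	if (idx >= EX_NCLUSTERS)
		return NULL;
	for (; img != NULL; img = img->base)
		if (img->clusters[idx] != NULL)
			return img->clusters[idx];
	return NULL;	/* allocated nowhere in the chain */
}
```

The model also shows why modifying the base is unsafe: any cluster the derived image has not overridden is read straight from the base.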


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
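
A small sketch of the pattern this entry describes: mark every unused descriptor slot with -1 and never pass -1 to close(2). The structure and helper names are hypothetical.

```c
#include <unistd.h>

#define EX_NDISKS	4

/* Hypothetical set of descriptors belonging to one vm. */
struct ex_vm_fds {
	int	kernel_fd;
	int	disk_fds[EX_NDISKS];
};

/* Mark every slot unused; a zeroed struct would wrongly point at fd 0. */
void
ex_vm_fds_init(struct ex_vm_fds *f)
{
	int i;

	f->kernel_fd = -1;
	for (i = 0; i < EX_NDISKS; i++)
		f->disk_fds[i] = -1;
}

/* Close *fdp if it is in use and mark it unused. Returns 1 if closed. */
int
ex_close_fd(int *fdp)
{
	if (*fdp == -1)
		return 0;
	close(*fdp);
	*fdp = -1;
	return 1;
}
```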


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@
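
Tracking limits as total usage, as this entry explains, might be sketched like this. The limit values are the ones quoted in the commit message; the structure and function are hypothetical and only illustrate the accounting.

```c
#include <stdint.h>

/* Limits quoted in the commit message. */
#define EX_MAX_CPU	4
#define EX_MAX_MEM	(2048ULL * 1024 * 1024)
#define EX_MAX_IFS	8

/* Hypothetical running totals for one user. */
struct ex_usage {
	unsigned int	cpus;
	uint64_t	mem;
	unsigned int	ifs;
};

/* Add a request to the totals if it fits. Returns 0 if ok, -1 if over limit. */
int
ex_usage_claim(struct ex_usage *u, unsigned int ncpu, uint64_t mem,
    unsigned int nifs)
{
	if (u->cpus + ncpu > EX_MAX_CPU || u->mem + mem > EX_MAX_MEM ||
	    u->ifs + nifs > EX_MAX_IFS)
		return -1;
	u->cpus += ncpu;
	u->mem += mem;
	u->ifs += nifs;
	return 0;
}
```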


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks, for instance.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@
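
Put as a check, upward-only migration means the receiving host must offer every cpu feature the sending host had. The sketch below models features as a plain bitmask, which is an assumption for illustration and not how vmd(8) represents them.

```c
#include <stdint.h>

/*
 * Return 1 if a vm sent from a host with src_features can be received
 * on a host with dst_features, i.e. no feature would go missing.
 */
int
ex_can_receive(uint32_t src_features, uint32_t dst_features)
{
	return (src_features & ~dst_features) == 0;
}
```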


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, '.', '_' and '-', but does not start with one of the latter
three.

ok mlarkin@ pd@
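
The naming rule from this entry can be written as a short check. The function name is hypothetical, and rejecting the empty string is an assumption of this sketch.

```c
#include <ctype.h>

/* Return 1 if name follows the rule described above, 0 otherwise. */
int
ex_vm_name_valid(const char *name)
{
	const unsigned char *p = (const unsigned char *)name;

	/* Must not be empty or start with '.', '-' or '_'. */
	if (*p == '\0' || *p == '.' || *p == '-' || *p == '_')
		return 0;
	for (; *p != '\0'; p++)
		if (!isalnum(*p) && *p != '.' && *p != '-' && *p != '_')
			return 0;
	return 1;
}
```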


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@
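
To make the /31 scheme concrete, here is one possible way to derive the host and guest addresses. The encoding below (vm id in the third octet, two addresses per interface) is an assumption for illustration; the commit message only states that the host side and the VM share a /31. The prefix constant is 100.64.0.0, the default named in the "local prefix" entry.

```c
#include <stdint.h>

#define EX_LOCAL_PREFIX	0x64400000u	/* 100.64.0.0 in host byte order */

/*
 * Host-byte-order IPv4 address for interface ifidx of vm vmid.
 * guest == 0 gives the host (tap) side, guest != 0 the VM side.
 * Hypothetical encoding, not necessarily the one vmd(8) uses.
 */
uint32_t
ex_local_addr(uint32_t prefix, uint32_t vmid, uint32_t ifidx, int guest)
{
	uint32_t addr = prefix + ((vmid << 8) | (ifidx * 2));

	return guest ? addr + 1 : addr;
}
```

Under this encoding, vm 1 with a single interface would see 100.64.1.0 on the host side and receive 100.64.1.1 via DHCP.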


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root, but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows removing the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking for pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
ignored did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666,but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
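
As an illustration of the convention described in 1.114 (not code from
vmd), a caller compares the return value against -1 exactly and only then
reads errno. probe_open() is a hypothetical helper name.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to open `path` read-only.  Returns 0 on
 * success, or the errno value reported by open(2) on failure.
 */
int
probe_open(const char *path)
{
	int fd;

	/* Compare against -1 exactly, not "< 0"; errno is valid only then. */
	if ((fd = open(path, O_RDONLY)) == -1)
		return (errno);
	close(fd);
	return (0);
}
```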


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this will make it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
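
A minimal sketch of the single-bitfield idea from 1.110 follows. The flag
names, struct and helpers are hypothetical and chosen for illustration;
they are not the identifiers used in vmd.

```c
#include <stdbool.h>

/* Hypothetical state flags; one bit per state. */
#define EX_VM_STATE_RUNNING	0x01
#define EX_VM_STATE_PAUSED	0x02
#define EX_VM_STATE_DISABLED	0x04

struct ex_vm {
	unsigned int	state;	/* OR of EX_VM_STATE_* flags */
};

/* Test whether a given state flag is set. */
bool
ex_vm_is(const struct ex_vm *vm, unsigned int flag)
{
	return ((vm->state & flag) != 0);
}

/* Pausing only makes sense for a running vm; RUNNING stays set. */
void
ex_vm_pause(struct ex_vm *vm)
{
	if (ex_vm_is(vm, EX_VM_STATE_RUNNING))
		vm->state |= EX_VM_STATE_PAUSED;
}

/* Clear the paused bit without touching the other flags. */
void
ex_vm_unpause(struct ex_vm *vm)
{
	vm->state &= ~EX_VM_STATE_PAUSED;
}
```

Compared with separate variables, a reporting path only has to inspect one
field to decide which state string to print.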


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
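
The rate limit described in 1.104 could be sketched roughly as below. Only
the two #define names and their values (6 and 3) come from the log
message; struct reboot_limiter and reboot_limiter_allow() are hypothetical.
The message does not say whether the counter resets after a slow restart,
so this sketch simply leaves it untouched.

```c
#include <stdbool.h>
#include <time.h>

/* Values quoted in the commit message. */
#define VM_START_RATE_SEC	6	/* faster restarts count as "fast" */
#define VM_START_RATE_LIMIT	3	/* fast restarts before the vm is stopped */

/* Hypothetical per-vm bookkeeping; not vmd's actual structure. */
struct reboot_limiter {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_count;	/* fast restarts seen so far */
};

/*
 * Record a (re)start at time `now`.  Returns true if the vm may start,
 * false once VM_START_RATE_LIMIT fast restarts have accumulated.
 */
bool
reboot_limiter_allow(struct reboot_limiter *rl, time_t now)
{
	if (rl->last_start != 0 &&
	    now - rl->last_start < VM_START_RATE_SEC)
		rl->fast_count++;
	rl->last_start = now;
	return (rl->fast_count < VM_START_RATE_LIMIT);
}
```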


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks, for instance.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@
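
The naming rule from 1.66 can be written as a short check. This is a
sketch under assumptions, not the function from vmd: vm_name_is_valid() is
a hypothetical name, and rejecting NULL and empty names is an assumption
the log message does not spell out.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical validator: a name may contain alphanumerics plus
 * '.', '_' and '-', but must not start with one of those three.
 */
bool
vm_name_is_valid(const char *name)
{
	size_t i;

	/* Assumption: NULL and empty names are rejected. */
	if (name == NULL || name[0] == '\0')
		return (false);
	if (name[0] == '.' || name[0] == '_' || name[0] == '-')
		return (false);
	for (i = 0; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return (false);
	}
	return (true);
}
```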


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". For network
configuration, this will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.133 31-Oct-2022 dv

vmd(8): remove unfinished user accounting.

User accounting and enforcement was never finished. tedu the thing
until someone wants to pick it up and finish it.

Originally found by Matthew Martin.

ok mlarkin@, kn@. input from tb@.


Revision tags: OPENBSD_7_2_BASE
# 1.132 13-Sep-2022 martijn

Add (partial) support for agentx in vmd.

Metrics can be found under mib-2.236 and VM-MIB (RFC7666).

Stress tested by and happy noises from Mischa Peters
OK dv@


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses the 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this will make it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
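
The idea of a single state bitfield can be sketched in C as follows. The EX_STATE_* flag names and the ex_vm structure are hypothetical stand-ins chosen for this example; the names and values used in vmd itself may differ.

```c
/* Hypothetical flag names for illustration; vmd's own names may differ. */
#define EX_STATE_RUNNING	0x01
#define EX_STATE_PAUSED		0x02
#define EX_STATE_DISABLED	0x04

struct ex_vm {
	unsigned int	state;	/* bitwise OR of EX_STATE_* flags */
};

/* Set the paused flag without touching the other state bits. */
void
ex_pause(struct ex_vm *vm)
{
	vm->state |= EX_STATE_PAUSED;
}

/* Clear the paused flag without touching the other state bits. */
void
ex_unpause(struct ex_vm *vm)
{
	vm->state &= ~EX_STATE_PAUSED;
}

/* Returns 1 if the paused flag is set, 0 otherwise. */
int
ex_is_paused(const struct ex_vm *vm)
{
	return ((vm->state & EX_STATE_PAUSED) != 0);
}
```

Keeping all flags in one word means a status report only needs to inspect a single field instead of several independent variables.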


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
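
The rate limit described above can be sketched in C roughly as follows. The two #defines use the names and values given in the commit message; the ex_limiter structure, the function name, and the choice to reset the counter after a slow restart are assumptions made for this example and may not match vmd's code.

```c
#include <stdbool.h>
#include <time.h>

#define VM_START_RATE_SEC	6	/* value from the commit message */
#define VM_START_RATE_LIMIT	3	/* value from the commit message */

/* Hypothetical per-vm bookkeeping for this sketch. */
struct ex_limiter {
	time_t	last_start;	/* time of the previous start */
	int	have_start;	/* 0 until the vm was started once */
	int	fast_reboots;	/* the "limit counter" */
};

/*
 * Record a (re)start at time `now`. Returns true if the vm may start,
 * false once too many fast reboots were seen and it should be stopped.
 */
bool
ex_start_allowed(struct ex_limiter *rl, time_t now)
{
	if (rl->have_start) {
		if (now - rl->last_start < VM_START_RATE_SEC)
			rl->fast_reboots++;
		else
			rl->fast_reboots = 0;	/* assumption: reset */
	}
	rl->have_start = 1;
	rl->last_start = now;
	return (rl->fast_reboots < VM_START_RATE_LIMIT);
}
```

In this sketch a vm that restarts one second apart is refused on the third fast reboot, while a vm that restarts every ten seconds is never limited.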


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
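
A small C sketch of the pattern behind this fix: unused descriptor fields are set to -1 rather than left at 0, and close(2) is skipped for -1. The ex_fds structure and function names are hypothetical and only illustrate the idea.

```c
#include <unistd.h>

#define EX_NFDS	4	/* hypothetical number of fd slots */

struct ex_fds {
	int	fd[EX_NFDS];
};

/* Mark every slot as unused; 0 would be a valid fd (stdin). */
void
ex_fds_init(struct ex_fds *f)
{
	int i;

	for (i = 0; i < EX_NFDS; i++)
		f->fd[i] = -1;
}

/* Close only slots that hold a descriptor; returns how many were closed. */
int
ex_fds_close(struct ex_fds *f)
{
	int i, n = 0;

	for (i = 0; i < EX_NFDS; i++) {
		if (f->fd[i] == -1)
			continue;
		close(f->fd[i]);
		f->fd[i] = -1;
		n++;
	}
	return (n);
}
```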


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@
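
Tracking limits by total usage, as described above, could look roughly like this in C. The limit values come from the commit message; the EX_* names, the ex_usage structure, and the function are hypothetical and not taken from vmd.

```c
#include <stdint.h>

/* Limits as stated in the commit message; names are hypothetical. */
#define EX_MAX_CPUS	4
#define EX_MAX_MEM	(2048ULL * 1024 * 1024)	/* 2G */
#define EX_MAX_IFS	8

/* Running totals for one user. */
struct ex_usage {
	unsigned int	cpus;
	uint64_t	mem;
	unsigned int	ifs;
};

/*
 * Account for a new vm if the user's totals stay within the limits.
 * Returns 0 on success, -1 if any limit would be exceeded.
 */
int
ex_usage_add(struct ex_usage *u, unsigned int cpus, uint64_t mem,
    unsigned int ifs)
{
	if (u->cpus + cpus > EX_MAX_CPUS ||
	    u->mem + mem > EX_MAX_MEM ||
	    u->ifs + ifs > EX_MAX_IFS)
		return (-1);
	u->cpus += cpus;
	u->mem += mem;
	u->ifs += ifs;
	return (0);
}
```

With these totals, four single-CPU vms of 512M each fit, and so does a single vm with 2G, which matches the example in the commit message.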


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@
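
The general technique, checking the opened descriptor instead of the path name, can be sketched in C as below. The helper name ex_open_regular is hypothetical, and the check shown (regular file only) is a simplified stand-in; owner or mode checks would go in the same place.

```c
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open path read-only and accept it only if the
 * opened object is a regular file. Returns the fd, or -1 on failure.
 */
int
ex_open_regular(const char *path)
{
	struct stat st;
	int fd;

	if ((fd = open(path, O_RDONLY)) == -1)
		return (-1);
	/* fstat(2) inspects the object behind fd, not the path name. */
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode)) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```

Because the check runs on the descriptor that will actually be used, replacing the file at that path between the check and the use has no effect on the result.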


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non-existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters as well as '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@
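
The naming rule above can be sketched as a small C check. The function name ex_valid_name is hypothetical, and rejecting the empty string is an assumption; vmd's own implementation may differ in details such as length limits.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical sketch of the rule above: returns 1 if name is valid,
 * 0 otherwise. Rejecting the empty string is an assumption.
 */
int
ex_valid_name(const char *name)
{
	size_t i;

	if (name == NULL || name[0] == '\0')
		return (0);
	/* The first character must be alphanumeric. */
	if (!isalnum((unsigned char)name[0]))
		return (0);
	/* Later characters may also be '.', '_' or '-'. */
	for (i = 1; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return (0);
	}
	return (1);
}
```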


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of the VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after the
last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking for pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.132 13-Sep-2022 martijn

Add (partial) support for agentx in vmd.

Metrics can be found under mib-2.236 and VM-MIB (RFC7666).

Stress tested by and happy noises from Mischa Peters
OK dv@


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses the 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
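
The rate limit described above can be sketched roughly as follows. Only the
two constants and their values come from the log message. The struct, the
function name, and the choice to reset the counter after a slow restart are
assumptions made for illustration, not vmd's actual code.

```c
#include <stdbool.h>
#include <time.h>

#define VM_START_RATE_SEC	6	/* value quoted in the log message */
#define VM_START_RATE_LIMIT	3	/* value quoted in the log message */

/* Hypothetical per-VM bookkeeping; not vmd's real structure. */
struct ex_reboot_limiter {
	time_t	last_start;	/* time of the previous (re)start, 0 if none */
	int	fast_count;	/* consecutive fast restarts seen so far */
};

/*
 * Record a (re)start at time `now`.  Returns true if the VM may start,
 * false once it has restarted too quickly VM_START_RATE_LIMIT times.
 * Resetting the counter after a slow restart is an assumption.
 */
static bool
ex_reboot_allowed(struct ex_reboot_limiter *rl, time_t now)
{
	if (rl->last_start != 0 && now - rl->last_start < VM_START_RATE_SEC)
		rl->fast_count++;
	else
		rl->fast_count = 0;
	rl->last_start = now;
	return (rl->fast_count < VM_START_RATE_LIMIT);
}
```

A caller would stop the VM instead of restarting it when the function returns
false.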


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
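
The class of bug fixed here can be shown with a small sketch. A
zero-initialized fd field looks like a valid descriptor (0, stdin), so fields
are set to -1 up front and -1 is skipped on close. The struct and function
names are hypothetical and do not mirror vmd's data structures.

```c
#include <unistd.h>

#define EX_NFDS	4

/* Hypothetical container for a vm's descriptors. */
struct ex_vm_fds {
	int	fds[EX_NFDS];
};

/* Mark every slot as "not open" so that 0 is never closed by mistake. */
static void
ex_fds_init(struct ex_vm_fds *v)
{
	int i;

	for (i = 0; i < EX_NFDS; i++)
		v->fds[i] = -1;
}

/* Close only the slots that hold a descriptor; returns how many were closed. */
static int
ex_fds_close(struct ex_vm_fds *v)
{
	int i, closed = 0;

	for (i = 0; i < EX_NFDS; i++) {
		if (v->fds[i] == -1)
			continue;	/* avoid close(-1) */
		close(v->fds[i]);
		v->fds[i] = -1;
		closed++;
	}
	return (closed);
}
```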


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@
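
A minimal sketch of the open-then-check pattern follows. Checking with
fstat(2) on the already opened descriptor means the file that was checked is
the file that will be used, even if the path is swapped afterwards. The
function name and the specific policy (regular file owned by the given uid)
are assumptions for illustration and are not vmd's actual access rules.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open `path` read-only, then validate the opened file itself.
 * Returns the fd on success, -1 on failure.  The policy checked here
 * (regular file, owned by `uid`) is a hypothetical example.
 */
static int
ex_open_checked(const char *path, uid_t uid)
{
	struct stat st;
	int fd;

	if ((fd = open(path, O_RDONLY | O_NONBLOCK)) == -1)
		return (-1);
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode) ||
	    st.st_uid != uid) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```

Calling stat(2) on the path before open(2) would leave a window between the
check and the use; the sketch avoids that by never checking the path itself.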


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@
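
The naming rule stated above can be sketched as a small predicate. The
function name is hypothetical, and rejecting an empty or NULL name is an
assumption, since the log message does not mention that case.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if `name` consists only of alphanumeric characters,
 * '.', '_' and '-', and does not start with one of those three.
 */
static bool
ex_vm_name_ok(const char *name)
{
	size_t i;

	if (name == NULL || name[0] == '\0')
		return (false);		/* assumption: empty names are invalid */
	if (name[0] == '.' || name[0] == '_' || name[0] == '-')
		return (false);
	for (i = 0; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return (false);
	}
	return (true);
}
```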


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet, but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root, but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after the
last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking for pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.131 08-May-2022 dv

vmd: fix rebooting a received vm

Rebooting a received vm resulted in vmd(8) exiting as a result of
flawed state tracking in the parent process.

When stopping a vm, clear the VM_RECEIVE_STATE flag. When starting
a vm, make sure the parent process collapses any existing memory
ranges after the vm is sent to the vmm process (responsible for
launching the vm).

ok mlarkin@


Revision tags: OPENBSD_7_1_BASE
# 1.130 01-Mar-2022 dv

vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
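
One way to picture the batching described above is to compute a start delay
per vm from its position in the start order. This is only a sketch under
stated assumptions: the function name is hypothetical, vms are assumed to be
numbered from 0, and vmd's real scheduling may work differently.

```c
/*
 * Hypothetical helper: delay in seconds before starting the vm at
 * position `index`, when `parallelism` vms start per batch and batches
 * are `delay_sec` seconds apart.
 */
static int
ex_start_delay(int index, int parallelism, int delay_sec)
{
	if (parallelism < 1)
		parallelism = 1;	/* guard against division by zero */
	return ((index / parallelism) * delay_sec);
}
```

With a parallelism of 4 and a 30 second delay, positions 0 to 3 start at once
and positions 4 to 7 start 30 seconds later.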


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@
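
The general pattern is to open first and then inspect the descriptor with
fstat(2), so that the object being checked is the object that will be used.
The sketch below illustrates that pattern only; open_regular_owned() and
its ownership policy are assumptions for the example, not the check that
vmd performs.

```c
#include <sys/types.h>
#include <sys/stat.h>

#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open "path" read-only and accept it only if the
 * opened object is a regular file owned by "uid" (uid 0 may open any).
 * Returns the fd on success, -1 on failure.
 */
static int
open_regular_owned(const char *path, uid_t uid)
{
	struct stat	 st;
	int		 fd;

	/* O_NONBLOCK keeps open(2) from hanging on a FIFO. */
	if ((fd = open(path, O_RDONLY | O_NONBLOCK)) == -1)
		return (-1);

	/* Check the opened descriptor, not the path name. */
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode) ||
	    (uid != 0 && st.st_uid != uid)) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```

Calling stat(2) on the path before open(2) would leave a window in which
the path can be replaced, for example by a symlink to another file.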


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (don't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-'. but does not start with the latter
three.

ok mlarkin@ pd@
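
Read literally, the rule above amounts to the following check. This is a
sketch of the rule as worded in the commit message, not the function from
vmd; the name vm_name_is_valid() and the rejection of empty names are
assumptions.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical validator: the first character must be alphanumeric, the
 * rest may be alphanumeric or one of '.', '_' and '-'.
 */
static bool
vm_name_is_valid(const char *name)
{
	size_t	 i;

	if (name == NULL || name[0] == '\0')
		return (false);	/* assumption: empty names are invalid */
	if (!isalnum((unsigned char)name[0]))
		return (false);
	for (i = 1; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return (false);
	}
	return (true);
}
```

With this reading, "openbsd.vm" and "vm-1_test" pass, while "-vm", ".vm"
and "my vm" are rejected.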


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings, which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.130 01-Mar-2022 dv

vmd(8): gracefully handle hitting data limits when starting a vm

With recent changes to login.conf(5) to restrict daemon datasize
to a finite value, users can now hit resource limits when attempting
to start a vm.

This change fixes the error path when hitting the limit. vmd(8)
will no longer abort and memory error messages are relayed to the
user.

While here, address potential under-reads/writes using atomicio
when relaying data between the child vm process and vmd's vmm
process.

Original diff from tedu@. OK mlarkin@.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
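
One way to picture the batching is as a per-VM start delay derived from its
position in the configuration. The helper below is purely illustrative:
stagger_delay() is a hypothetical name, and the arithmetic is an assumption
about how "batches of N with a delay between batches" could be computed,
not the scheduling code in vmd.

```c
/*
 * Hypothetical helper: seconds to wait before starting the VM at
 * position "idx" (0-based) when "parallel" VMs start per batch and
 * batches are "delay_sec" seconds apart.
 */
static unsigned int
stagger_delay(unsigned int idx, unsigned int parallel,
    unsigned int delay_sec)
{
	if (parallel == 0)
		parallel = 1;	/* avoid dividing by zero */
	return ((idx / parallel) * delay_sec);
}
```

With a parallelism of 4 and the default 30 second delay, VMs 0 to 3 would
start immediately, VMs 4 to 7 after 30 seconds, and VMs 8 to 11 after 60.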


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this will make it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
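
The idea of replacing several flags with one bitfield can be sketched as
below. All identifiers here (EX_STATE_*, struct ex_vm, ex_has(),
ex_state_name()) are hypothetical and chosen for the example; they are not
the names or values used in vmd.

```c
#include <stdbool.h>

/* Hypothetical state bits, one bit per condition. */
#define EX_STATE_RUNNING	0x01
#define EX_STATE_PAUSED		0x02
#define EX_STATE_DISABLED	0x04

struct ex_vm {
	unsigned int	state;	/* bitwise OR of EX_STATE_* */
};

/* Test whether every bit in "flag" is set. */
static bool
ex_has(const struct ex_vm *vm, unsigned int flag)
{
	return ((vm->state & flag) == flag);
}

/* Map the bitfield to a label for a status listing. */
static const char *
ex_state_name(const struct ex_vm *vm)
{
	if (ex_has(vm, EX_STATE_PAUSED))
		return ("paused");
	if (ex_has(vm, EX_STATE_RUNNING))
		return ("running");
	if (ex_has(vm, EX_STATE_DISABLED))
		return ("disabled");
	return ("stopped");
}
```

A paused VM keeps its running bit, so pausing is "state |= EX_STATE_PAUSED"
and unpausing is "state &= ~EX_STATE_PAUSED", with no second variable to
keep in sync.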


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-'. but does not start with the latter
three.

ok mlarkin@ pd@
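
The naming rule stated above can be written down as a small predicate. This is a sketch of the rule as described in the log message, not the function in vmd(8); the name vm_name_ok is hypothetical, and rejecting the empty string is an assumption.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Return 1 if name consists only of alphanumerics, '.', '_' and '-',
 * and does not start with '.', '_' or '-'. Return 0 otherwise.
 * Assumption: NULL and the empty string are rejected.
 */
static int
vm_name_ok(const char *name)
{
	size_t i;

	if (name == NULL || name[0] == '\0')
		return (0);
	if (name[0] == '.' || name[0] == '_' || name[0] == '-')
		return (0);
	for (i = 0; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return (0);
	}
	return (1);
}
```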


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Further network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't do much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) but runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.129 04-Jan-2022 claudio

Fix some simple -Wunused-but-set-variable warnings.
OK benno@ dv@


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
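
As an illustration of the single-bitfield approach, the sketch below packs a few states into one unsigned int and derives a display string from it. The EX_ flag names, their values, the helper function, and the precedence between states are all assumptions for the example and are not taken from vmd(8).

```c
/* Hypothetical state bits; several can be set at the same time. */
#define EX_STATE_RUNNING	0x01
#define EX_STATE_DISABLED	0x02
#define EX_STATE_PAUSED		0x04

/*
 * Map a state bitfield to a short description. "paused" wins over
 * "running" because a paused VM still has its running bit set here.
 */
static const char *
ex_state_name(unsigned int state)
{
	if (state & EX_STATE_PAUSED)
		return ("paused");
	if (state & EX_STATE_RUNNING)
		return ("running");
	if (state & EX_STATE_DISABLED)
		return ("stopped (disabled)");
	return ("stopped");
}
```

Setting a state is then `state |= EX_STATE_PAUSED`, clearing it is `state &= ~EX_STATE_PAUSED`, and a check is a single mask test instead of a comparison across several variables.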


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
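
The rate-limiting rule above can be sketched like this. The two constants and their values are taken from the log message; the struct, the function name, and the choice to reset the counter after a slow restart are assumptions for the example and may differ from what vmd(8) does.

```c
#include <time.h>

#define VM_START_RATE_SEC	6	/* value from the log message */
#define VM_START_RATE_LIMIT	3	/* value from the log message */

struct vm_rate {			/* hypothetical bookkeeping */
	time_t	last_start;		/* 0 means "never started" */
	int	fast_count;		/* consecutive fast restarts */
};

/*
 * Record a (re)start at time now. Returns 1 if the VM has restarted
 * too quickly too many times and should be stopped, 0 otherwise.
 */
static int
vm_rate_check(struct vm_rate *r, time_t now)
{
	if (r->last_start != 0 && now - r->last_start < VM_START_RATE_SEC)
		r->fast_count++;
	else
		r->fast_count = 0;	/* assumption: slow restart resets */
	r->last_start = now;
	return (r->fast_count >= VM_START_RATE_LIMIT);
}
```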


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents time-of-check to time-of-use (TOCTOU) attacks for instances.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@
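
The naming rule above can be sketched in C roughly as follows. This is
an illustration only: vm_name_ok is a hypothetical helper, not the
function vmd(8) actually uses, and it assumes that an empty name is
rejected.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if name follows the rule described
 * above, 0 otherwise. Assumption: empty names are rejected.
 */
int
vm_name_ok(const char *name)
{
	size_t i;

	if (name == NULL || name[0] == '\0')
		return 0;
	/* The name must not start with '.', '_' or '-'. */
	if (name[0] == '.' || name[0] == '_' || name[0] == '-')
		return 0;
	/* Every character must be alphanumeric or one of ". _ -". */
	for (i = 0; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return 0;
	}
	return 1;
}
```

With this sketch, "openbsd.vm" and "my_vm-1" are accepted, while "-foo"
and "bad name" are rejected.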


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, e.g. "vm1-if0-myvm". Further network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, e.g.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many
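
The poll-before-accept idea can be sketched as below. This is a minimal
illustration, not the code from this revision: wait_readable is a
hypothetical helper. It relies on poll(2) not being restarted after a
signal handler runs, so the caller regains control and can check a quit
flag before calling accept4(2).

```c
#include <poll.h>

/*
 * Hypothetical helper: wait until fd is readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. An interrupted
 * poll(2) (EINTR) also returns -1, which lets the caller leave its
 * accept loop instead of blocking in a restarted accept4(2).
 */
int
wait_readable(int fd, int timeout_ms)
{
	struct pollfd pfd;
	int n;

	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;

	n = poll(&pfd, 1, timeout_ms);
	if (n == -1)
		return -1;
	if (n == 0)
		return 0;
	return (pfd.revents & POLLIN) ? 1 : -1;
}
```

A control socket loop would call such a helper first and only call
accept4(2) when it returns 1.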


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.128 13-Dec-2021 deraadt

including sys/cdefs.h manually started as a result of netbsd trying to
macro-build a replacement for sccsid, and was done without any concern
for namespace damage. Unfortunately this practice started infecting
other code as others were unaware they didn't need the file.
ok millert guenther


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC 7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
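
As a simplified model of the staggering described above, the start
delay of each configured VM can be derived from its position in the
list. The function name and the exact formula are assumptions made for
illustration; the actual scheduling in vmd(8) may differ.

```c
/*
 * Hypothetical model: VMs are started in batches of "parallel" VMs,
 * with "delay" seconds between batches. Returns the delay in seconds
 * before the VM at zero-based position idx is started.
 */
unsigned int
stagger_delay(unsigned int idx, unsigned int parallel, unsigned int delay)
{
	if (parallel == 0)
		parallel = 1;	/* avoid division by zero */
	return (idx / parallel) * delay;
}
```

With a parallelism of 4 and a delay of 30 seconds, the first four VMs
start immediately, the next four after 30 seconds, and so on.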


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
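
A small sketch of the stricter check, using open(2) as the example
system call. The wrapper name open_ro is hypothetical and only serves
to show the comparison against -1 rather than "< 0".

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical wrapper: open path read-only.
 * Returns the fd, or -1 with the saved errno stored in *err.
 */
int
open_ro(const char *path, int *err)
{
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1) {		/* compare with -1, not "fd < 0" */
		*err = errno;	/* errno is only meaningful here */
		return -1;
	}
	*err = 0;
	return fd;
}
```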


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
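
The single-bitfield approach can be illustrated as follows. The flag
names and the reporting order are hypothetical and chosen for this
sketch; the macros actually used by vmd(8) may differ.

```c
/* Hypothetical state flags, one bit per state. */
#define VMS_RUNNING	0x01
#define VMS_PAUSED	0x02
#define VMS_DISABLED	0x04

/*
 * Map a state bitfield to a short description, as a status command
 * might print it. Assumption: "paused" takes precedence over
 * "running", since a paused VM still has a running process.
 */
const char *
vm_state_str(unsigned int state)
{
	if (state & VMS_PAUSED)
		return "paused";
	if (state & VMS_RUNNING)
		return "running";
	if (state & VMS_DISABLED)
		return "disabled";
	return "stopped";
}
```

Setting and clearing a state then becomes a single bit operation, for
example state |= VMS_PAUSED when pausing and state &= ~VMS_PAUSED when
unpausing.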


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
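
The rate limit described above can be sketched like this. Only the two
constants and their values come from the log message; the struct, the
function name, and the choice to reset the counter after a slow restart
are assumptions made for illustration.

```c
#include <time.h>

#define VM_START_RATE_SEC	6	/* value quoted in the log message */
#define VM_START_RATE_LIMIT	3	/* value quoted in the log message */

/* Hypothetical per-VM bookkeeping; field names are illustrative. */
struct start_rate {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_count;	/* number of consecutive fast restarts */
};

/*
 * Record a (re)start at time now. Returns 1 if the VM may start,
 * 0 if it has reached the limit and should be stopped.
 */
int
start_rate_check(struct start_rate *sr, time_t now)
{
	if (sr->last_start != 0 &&
	    now - sr->last_start < VM_START_RATE_SEC)
		sr->fast_count++;
	else
		sr->fast_count = 0;	/* assumption: slow restart resets */
	sr->last_start = now;
	return sr->fast_count < VM_START_RATE_LIMIT;
}
```

In this sketch a VM that restarts every 2 seconds is allowed to start
three more times and is refused on the third fast restart.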


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
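
The pattern behind this fix can be sketched as below. The struct and
function names are hypothetical. The point is that a zeroed fd field
refers to descriptor 0 (stdin), so unused slots are set to -1 and -1 is
never passed to close(2).

```c
#include <unistd.h>

#define SKETCH_NFDS	4	/* hypothetical number of fd slots */

struct fd_set_sketch {
	int	fd[SKETCH_NFDS];
};

/* Mark every slot as unused; 0 would mean stdin, so use -1. */
void
fds_init(struct fd_set_sketch *s)
{
	int i;

	for (i = 0; i < SKETCH_NFDS; i++)
		s->fd[i] = -1;
}

/* Close all used slots, skipping -1. Returns the number closed. */
int
fds_close(struct fd_set_sketch *s)
{
	int i, n = 0;

	for (i = 0; i < SKETCH_NFDS; i++) {
		if (s->fd[i] == -1)
			continue;
		close(s->fd[i]);
		s->fd[i] = -1;
		n++;
	}
	return n;
}
```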


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.127 29-Nov-2021 deraadt

mostly avoid sys/param.h with a local nitems()
ok mlarkin


Revision tags: OPENBSD_7_0_BASE
# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30
seconds between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
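
As a rough illustration of the batching arithmetic described above, the start delay of each VM can be derived from its position in the list. This sketch assumes a simple fixed-batch scheme; the helper name staggered_start_delay and the macro START_DELAY_SEC are hypothetical, and vmd's real scheduling may differ.

```c
#include <assert.h>

/* Default delay between batches, as quoted in the commit message. */
#define START_DELAY_SEC	30

/*
 * Hypothetical helper: seconds to wait before starting the VM at
 * zero-based position idx when `parallel` VMs are started per batch
 * (the commit defaults `parallel` to the number of online CPUs).
 */
static unsigned int
staggered_start_delay(unsigned int idx, unsigned int parallel)
{
	assert(parallel > 0);

	/* Integer division yields the batch number of this VM. */
	return (idx / parallel) * START_DELAY_SEC;
}
```

With four online CPUs, VMs 0 through 3 would start immediately, VMs 4 through 7 after 30 seconds, and so on.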


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
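
A minimal sketch of the rate limit described above, using the two constants named in the commit message. The struct, the function name, and the choice to reset the counter after a slow restart are assumptions made for illustration, not a copy of vmd's logic.

```c
#include <assert.h>
#include <time.h>

#define VM_START_RATE_SEC	6	/* value quoted in the commit */
#define VM_START_RATE_LIMIT	3	/* value quoted in the commit */

/* Hypothetical per-VM bookkeeping. */
struct reboot_tracker {
	time_t	last_start;	/* 0 means "never started" */
	int	fast_count;	/* consecutive fast reboots */
};

/*
 * Record a (re)start at time `now`. Returns 1 if the VM has rebooted
 * too quickly too often and should be stopped, 0 otherwise.
 */
static int
reboot_rate_exceeded(struct reboot_tracker *rt, time_t now)
{
	assert(rt != NULL);

	if (rt->last_start != 0 &&
	    now - rt->last_start < VM_START_RATE_SEC)
		rt->fast_count++;
	else
		rt->fast_count = 0;	/* assumption: slow restart resets */

	rt->last_start = now;
	return rt->fast_count >= VM_START_RATE_LIMIT;
}
```

In this sketch a VM that restarts three times in a row within six seconds of its previous start is flagged, while a normal reboot after a longer uptime clears the counter.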


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
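
The failure mode above arises because a zero-filled fd field is indistinguishable from a valid descriptor 0 (stdin). A minimal sketch of the usual remedy follows; the helper names fds_init and fd_close are hypothetical and not taken from vmd.

```c
#include <assert.h>
#include <unistd.h>

/* Mark every slot as "no descriptor" so that 0 is never implied. */
static void
fds_init(int fds[], int n)
{
	int i;

	assert(fds != NULL);
	for (i = 0; i < n; i++)
		fds[i] = -1;
}

/* Close a descriptor if it is open and mark the slot as unused. */
static void
fd_close(int *fd)
{
	assert(fd != NULL);

	/* Avoid calling close(2) on -1. */
	if (*fd == -1)
		return;
	close(*fd);
	*fd = -1;
}
```

Resetting the slot to -1 after close(2) also makes a second cleanup pass over the same structure harmless.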


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (don't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of the VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings, which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled-by-default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Further network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root, but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and after the
last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.126 18-Jul-2021 dv

vmd(8): remove invalid errno values from config_setvm

Refactor config_setvm to directly return error code on failure
instead of returning -1 and setting errno. It was setting unsupported
values not defined in <errno.h>.

OK mlarkin@


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
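
A minimal sketch of the convention described in 1.114, assuming a
hypothetical helper named ex_try_open() (not a function from vmd): the
failure test compares against -1 exactly, and errno is read only on that
path.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration only.
 * Returns 0 if path can be opened read-only, otherwise the errno value.
 */
int
ex_try_open(const char *path)
{
	int fd;

	/* Compare against -1, not "< 0": only -1 signals an error. */
	if ((fd = open(path, O_RDONLY)) == -1)
		return errno;	/* errno is only meaningful here */
	close(fd);
	return 0;
}
```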


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
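
A minimal sketch of the single-bitfield approach described in 1.110. The
flag names, their values, and ex_state_name() are assumptions made for
illustration; they are not claimed to match the definitions in vmd.

```c
/* Hypothetical state flags; names and values are illustrative only. */
#define EX_STATE_RUNNING	0x01
#define EX_STATE_PAUSED		0x02
#define EX_STATE_DISABLED	0x04

/*
 * Map a state bitfield to a human-readable label. A paused VM is
 * still running, so the paused bit is checked first.
 */
const char *
ex_state_name(unsigned int state)
{
	if (state & EX_STATE_PAUSED)
		return "paused";
	if (state & EX_STATE_RUNNING)
		return "running";
	if (state & EX_STATE_DISABLED)
		return "disabled";
	return "stopped";
}
```

Setting a state is then `state |= EX_STATE_PAUSED` and clearing it is
`state &= ~EX_STATE_PAUSED`, with no separate variables to keep in sync.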


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
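
A minimal sketch of the rate limit described in 1.104. The two #defines
carry the values quoted in the commit message; struct ex_start_rate,
ex_start_rate_ok(), and the choice to reset the counter after a slow
restart are assumptions for illustration, not vmd's actual code.

```c
#include <stdbool.h>
#include <time.h>

#define VM_START_RATE_SEC	6	/* value quoted in the commit message */
#define VM_START_RATE_LIMIT	3	/* value quoted in the commit message */

/* Hypothetical per-VM bookkeeping; not vmd's actual structure. */
struct ex_start_rate {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_starts;	/* consecutive fast restarts seen so far */
};

/*
 * Record a (re)start at time "now". Returns true if the VM may start,
 * false once the limit of fast restarts has been reached.
 */
bool
ex_start_rate_ok(struct ex_start_rate *sr, time_t now)
{
	if (sr->last_start != 0 && now - sr->last_start < VM_START_RATE_SEC)
		sr->fast_starts++;
	else
		sr->fast_starts = 0;	/* assumption: slow restart resets */
	sr->last_start = now;
	return sr->fast_starts < VM_START_RATE_LIMIT;
}
```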


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-'. but does not start with the latter
three.

ok mlarkin@ pd@
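
A minimal sketch of the naming rule described in 1.66. The function name
ex_vm_name_valid() is hypothetical, and rejecting NULL or empty names is
an assumption; the commit message only states the allowed characters and
the restriction on the first character.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical check: alphanumerics plus '.', '_' and '-' are allowed,
 * but the name must not start with one of those three characters.
 */
bool
ex_vm_name_valid(const char *name)
{
	size_t i;

	if (name == NULL || name[0] == '\0')
		return false;	/* assumption: empty names are rejected */
	if (name[0] == '.' || name[0] == '_' || name[0] == '-')
		return false;
	for (i = 0; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return false;
	}
	return true;
}
```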


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of the VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings, which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled-by-default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet, but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) but runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after the
last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.125 05-May-2021 dv

Refactor vm_instance to return error value directly.

vmd(8)'s vm_instance function set unsupported errno values. Change the
api to directly return an error (either errno or custom vmd error).

"go for it" -mlarkin@


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this will make it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU attacks, for instance.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings, which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled-by-default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows removing the vm from the
daemon's list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after the
last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking for pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.124 04-May-2021 dv

Init debug logging state before attempting to log.

Error messages related to bad configuration were not flushing to
stderr.

OK mlarkin@


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC 7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses the 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
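
As a rough illustration of the single-bitfield approach described in 1.110,
the sketch below keeps several state bits in one unsigned integer. The flag
names, their values, and state_name() are hypothetical and chosen for this
example only; they are not taken from vmd's headers.

```c
/* Hypothetical flag names for illustration; not vmd's actual constants. */
#define ST_RUNNING	0x01u
#define ST_PAUSED	0x02u
#define ST_DISABLED	0x04u

/*
 * Map a state bitfield to a short label.  A paused vm still has the
 * running bit set in this sketch, so the paused bit is tested first.
 */
const char *
state_name(unsigned int state)
{
	if (state & ST_PAUSED)
		return ("paused");
	if (state & ST_RUNNING)
		return ("running");
	if (state & ST_DISABLED)
		return ("disabled");
	return ("stopped");
}
```

Setting a bit with `state |= ST_PAUSED` and clearing it with
`state &= ~ST_PAUSED` replaces updates to several separate variables.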


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
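
The rate limit described in 1.104 can be sketched roughly as follows. The two
constants use the names and values quoted in the commit message; the
start_tracker struct, start_allowed(), and the choice to reset the counter
after a slow restart are assumptions made for this example and may differ
from what vmd actually does.

```c
#include <stdbool.h>
#include <time.h>

/* Names and values as quoted in the commit message. */
#define VM_START_RATE_SEC	6
#define VM_START_RATE_LIMIT	3

/* Hypothetical per-VM bookkeeping; not vmd's actual struct. */
struct start_tracker {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_starts;	/* consecutive fast restarts seen so far */
};

/*
 * Record a (re)start at time `now'.  Returns true if the VM may start,
 * false once VM_START_RATE_LIMIT fast restarts have accumulated.
 */
bool
start_allowed(struct start_tracker *t, time_t now)
{
	if (t->last_start != 0 && now - t->last_start < VM_START_RATE_SEC)
		t->fast_starts++;
	else
		t->fast_starts = 0;	/* assumption: a slow restart resets */
	t->last_start = now;
	return (t->fast_starts < VM_START_RATE_LIMIT);
}
```

With these values, a VM that restarts once per second is refused on its
third fast restart, while one that restarts every ten seconds is never refused.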


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU attacks for instances.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters as well as '.', '_' and '-', but does not start with any of the
latter three.

ok mlarkin@ pd@
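
The naming rule from 1.66 could be checked along these lines. This is a
minimal sketch, not the code that was committed: vm_name_ok() is a
hypothetical helper, and rejecting an empty name is an assumption the commit
message does not state.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if `name' consists only of alphanumeric characters,
 * '.', '_' and '-', and does not start with one of those three.
 */
bool
vm_name_ok(const char *name)
{
	size_t i;

	/* Assumption: NULL and empty names are rejected. */
	if (name == NULL || name[0] == '\0')
		return (false);
	if (name[0] == '.' || name[0] == '_' || name[0] == '-')
		return (false);
	for (i = 0; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return (false);
	}
	return (true);
}
```

Under this sketch "openbsd.vm" passes, while "-foo" and "my vm" are rejected.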


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of the VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". For network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number of or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't do much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) but runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows removing the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking for pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.123 26-Apr-2021 dv

vmd(8): fix vmctl client "wait" state corruption

Adds queue-based tracking of waiting client state to fix the cause of
state corruption when a vmctl(8) user cancels a wait and restarts it.
The socket fd value for the control process client was being used to
track the waiting party, but this also prevented multiple waiting
clients.

This moves all the state tracking of who to notify of a vm's stopping
to the control process and no longer requires the parent process to
track it in the global environment state.

Future work will be needed to smooth out the difference between the
IMSG_VMDOP_TERMINATE_VM_{EVENT,RESPONSE} events instead of needing to
translate before relaying to the vmctl(8) client.

Tested by Mischa Peters (thanks!)

ok mlarkin@


Revision tags: OPENBSD_6_9_BASE
# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC 7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses the 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.
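
As a rough illustration of the batching described above (not vmd's
actual code), the start delay for the n-th configured VM can be derived
from the parallelism and the per-batch delay. The function name
start_delay and its parameters are hypothetical; only the defaults of
ncpuonline and 30 seconds come from the log message.

```c
/*
 * Sketch only: compute how long to wait before starting the VM at
 * position `index` (0-based) when VMs are started in batches of
 * `parallel` with `delay_sec` seconds between batches.
 */
unsigned int
start_delay(unsigned int index, unsigned int parallel, unsigned int delay_sec)
{
	/* Assumption: treat a parallelism of 0 as 1 to avoid dividing by zero. */
	if (parallel == 0)
		parallel = 1;
	return ((index / parallel) * delay_sec);
}
```

With a parallelism of 4 and a delay of 30 seconds, the first four VMs
would start immediately and the next four after 30 seconds.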

ok mlarkin@ (also addressed comments from cheloha@)


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
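
A minimal sketch of the convention this commit enforces, using open(2)
as the example. The helper name open_readonly is hypothetical and is not
taken from vmd.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/*
 * Open a file read-only.  Failure is detected by comparing against -1,
 * not by testing for "< 0"; errno is only meaningful in that case.
 */
int
open_readonly(const char *path)
{
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1) {
		fprintf(stderr, "open %s: %s\n", path, strerror(errno));
		return (-1);
	}
	return (fd);
}
```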


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states
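
The idea of a single state bitfield can be sketched as follows. The
flag names and values here are hypothetical and chosen for illustration
only; they are not necessarily the ones used in vmd.

```c
#include <stdbool.h>

/* Hypothetical state flags, one bit each. */
#define ST_RUNNING	0x01
#define ST_PAUSED	0x02
#define ST_DISABLED	0x04

/* Return `state` with the given flag bits set. */
unsigned int
state_set(unsigned int state, unsigned int flags)
{
	return (state | flags);
}

/* Return `state` with the given flag bits cleared. */
unsigned int
state_clear(unsigned int state, unsigned int flags)
{
	return (state & ~flags);
}

/* True only if every bit in `flags` is set in `state`. */
bool
state_has(unsigned int state, unsigned int flags)
{
	return ((state & flags) == flags);
}
```

A paused VM that is still running would then carry both ST_RUNNING and
ST_PAUSED in one variable instead of two separate ones.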

no functional change intended

ok ccardenas@ mlarkin@


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.
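
The rate limit described above could look roughly like the sketch
below. The two #defines and their values come from the log message; the
struct, the function name start_allowed, and the choice to reset the
counter after a slow restart are assumptions made for illustration.

```c
#include <stdbool.h>
#include <time.h>

#define VM_START_RATE_SEC	6	/* restarts faster than this count as "fast" */
#define VM_START_RATE_LIMIT	3	/* fast restarts tolerated before stopping */

/* Hypothetical per-VM bookkeeping; not vmd's actual structure. */
struct start_rate {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_count;	/* consecutive fast restarts seen so far */
};

/*
 * Record a (re)start at time `now`.  Returns true if the VM may start,
 * false once VM_START_RATE_LIMIT fast restarts have accumulated.
 */
bool
start_allowed(struct start_rate *sr, time_t now)
{
	if (sr->last_start != 0 && now - sr->last_start < VM_START_RATE_SEC)
		sr->fast_count++;
	else
		sr->fast_count = 0;	/* assumption: a slow restart resets the counter */
	sr->last_start = now;
	return (sr->fast_count < VM_START_RATE_LIMIT);
}
```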

OK mlarkin@


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents time-of-check to time-of-use (TOCTOU) attacks for instances.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters as well as '.', '_' and '-', but does not start with any of the
latter three.
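
The naming rule can be sketched as a small check. This is an
illustration of the rule as stated in the log message, not vmd's actual
function; the name vm_name_ok is hypothetical, and rejecting empty names
is an assumption.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if `name` consists only of alphanumeric characters,
 * '.', '_' and '-', and does not start with one of those three.
 */
bool
vm_name_ok(const char *name)
{
	size_t i;

	/* Assumption: NULL and empty names are rejected. */
	if (name == NULL || name[0] == '\0')
		return (false);
	if (name[0] == '.' || name[0] == '_' || name[0] == '-')
		return (false);
	for (i = 0; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return (false);
	}
	return (true);
}
```

Under this sketch "openbsd.vm" is accepted while "-foo" and "a b" are
rejected.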

ok mlarkin@ pd@


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this allows performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will check
whether a specified VM is already running. "load" allows adding
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root, but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows removing the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking for pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.122 05-Apr-2021 dv

Send correct response type on unpause errors.

ok pd@


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC 7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses the 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
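
A minimal sketch of tracking state in a single bitfield, as described above.
The flag names, values and struct layout are hypothetical; they are chosen
for illustration and are not the definitions used in vmd.

```c
#include <stdint.h>

/* Hypothetical state flags, one bit each. */
#define EX_STATE_RUNNING	0x01
#define EX_STATE_PAUSED		0x02
#define EX_STATE_SHUTDOWN	0x04

struct ex_vm {
	uint32_t	state;	/* bitwise OR of EX_STATE_* flags */
};

/* Set the paused bit without touching the other flags. */
void
ex_vm_pause(struct ex_vm *vm)
{
	vm->state |= EX_STATE_PAUSED;
}

/* Clear the paused bit without touching the other flags. */
void
ex_vm_unpause(struct ex_vm *vm)
{
	vm->state &= ~(uint32_t)EX_STATE_PAUSED;
}

/* Return 1 if the paused bit is set, 0 otherwise. */
int
ex_vm_is_paused(const struct ex_vm *vm)
{
	return ((vm->state & EX_STATE_PAUSED) != 0);
}
```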


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
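
A minimal sketch of the rate limit described above. The two #defines use the
names and values quoted in the log message; the struct, the function, and the
choice to reset the counter after a slow restart are assumptions made for
illustration and are not taken from vmd.c.

```c
#include <time.h>

#define VM_START_RATE_SEC	6	/* value quoted in the log message */
#define VM_START_RATE_LIMIT	3	/* value quoted in the log message */

/* Hypothetical per-VM bookkeeping. */
struct ex_vm_rate {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_reboots;	/* consecutive fast restarts seen so far */
};

/*
 * Record a (re)start at time "now".
 * Returns 0 if the VM may start, -1 if it should be stopped.
 */
int
ex_vm_check_start(struct ex_vm_rate *r, time_t now)
{
	if (r->last_start != 0 && now - r->last_start < VM_START_RATE_SEC)
		r->fast_reboots++;
	else
		r->fast_reboots = 0;	/* assumption: slow restart resets */

	r->last_start = now;

	if (r->fast_reboots >= VM_START_RATE_LIMIT)
		return (-1);
	return (0);
}
```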


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@
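
A minimal sketch of the naming rule described above. The function name is
hypothetical, and rejecting NULL or empty names is an assumption; the log
message only states which characters are allowed and which may not come
first.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical check: return 1 if "name" consists only of alphanumeric
 * characters, '.', '_' and '-', and does not start with '.', '_' or '-'.
 * Return 0 otherwise.
 */
int
ex_valid_vm_name(const char *name)
{
	size_t i;

	/* Assumption: NULL and empty names are rejected. */
	if (name == NULL || name[0] == '\0')
		return (0);

	/* The first character must not be '.', '_' or '-'. */
	if (name[0] == '.' || name[0] == '_' || name[0] == '-')
		return (0);

	for (i = 0; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return (0);
	}
	return (1);
}
```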


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Further network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking for pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.121 29-Mar-2021 dv

Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp
and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior
ignored did not intercept these packets and instead transmitted them.

This should make vmd(8)'s dhcp behave more as a true dhcp server should and
allows it to work properly with the new dhcpleased(8) attempting a renewal.

OK mlarkin@


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666,but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this will makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of suchs
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This works is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds ot the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating disk derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) clould close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-'. but does not start with the latter
three.

ok mlarkin@ pd@
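
The naming rule above can be sketched as a small check. This is a hypothetical helper (`name_is_valid` is not the function from vmd.c), and rejecting empty or NULL names is an added assumption that the commit message does not state.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Returns 1 if name consists only of alphanumerics, '.', '_' and '-'
 * and starts with an alphanumeric character, 0 otherwise.
 * Assumption: NULL and empty names are treated as invalid.
 */
int
name_is_valid(const char *name)
{
	size_t		i;
	unsigned char	c;

	if (name == NULL || name[0] == '\0')
		return 0;
	/* The first character must not be '.', '_' or '-'. */
	if (!isalnum((unsigned char)name[0]))
		return 0;
	for (i = 1; name[i] != '\0'; i++) {
		c = (unsigned char)name[i];
		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return 0;
	}
	return 1;
}
```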


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of the VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". For network
configuration, more will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root, but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after the
last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking for pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.120 27-Jan-2021 deraadt

these programs (with common ancestry) had a -fno-common problem related
to privsep_procid.
ok mortimer


Revision tags: OPENBSD_6_8_BASE
# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
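
As a worked example of the batching arithmetic described above, the following sketch computes the start delay for the n-th configured VM. The function name and its parameters are hypothetical and only model the stated behaviour (batches of `parallel` VMs, `delay` seconds between batches); it is not how vmd schedules the starts internally.

```c
/*
 * Delay in seconds before starting the VM at 0-based position idx
 * when VMs are started in batches of `parallel` with `delay`
 * seconds between consecutive batches.
 */
unsigned int
start_delay(unsigned int idx, unsigned int parallel, unsigned int delay)
{
	if (parallel == 0)
		parallel = 1;	/* assumption: treat 0 as one VM per batch */
	return (idx / parallel) * delay;
}
```

On a host with 4 online CPUs and the default 30 second delay, VMs 0 to 3 would start immediately, VMs 4 to 7 after 30 seconds, and VMs 8 to 11 after 60 seconds.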


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
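
A short sketch of the convention described above, using open(2) as the example system call. The wrapper `try_open` is hypothetical; the point is only that the failure test is `== -1` and that errno is read only in that case.

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Open path read-only. On failure returns -1 and stores errno in *err;
 * on success returns the descriptor and stores 0 in *err.
 */
int
try_open(const char *path, int *err)
{
	int	fd;

	/* Compare against -1 exactly, not "< 0". */
	if ((fd = open(path, O_RDONLY)) == -1) {
		*err = errno;	/* errno is only meaningful here */
		return -1;
	}
	*err = 0;
	return fd;
}
```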


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this will make it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
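
The idea of one bitfield replacing several separate flags can be sketched as follows. The flag names and values here are hypothetical placeholders, not the definitions used in vmd.

```c
/* Hypothetical state bits; the real names and values in vmd may differ. */
#define ST_RUNNING	0x01u
#define ST_PAUSED	0x02u
#define ST_DISABLED	0x04u

/* Return state with the given flag bits set. */
unsigned int
state_set(unsigned int state, unsigned int flags)
{
	return state | flags;
}

/* Return state with the given flag bits cleared. */
unsigned int
state_clear(unsigned int state, unsigned int flags)
{
	return state & ~flags;
}

/* Return 1 if all of the given flag bits are set in state, else 0. */
int
state_has(unsigned int state, unsigned int flags)
{
	return (state & flags) == flags;
}
```

A paused VM would then carry both ST_RUNNING and ST_PAUSED in a single variable, and unpausing clears one bit without touching the others.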


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
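
The rate limit described above can be sketched roughly like this. The two constants and their values come from the commit message; the struct, the function name, and the choice to reset the counter after a slow restart are assumptions made for illustration and may not match vmd.

```c
#include <time.h>

#define VM_START_RATE_SEC	6	/* min interval between restarts */
#define VM_START_RATE_LIMIT	3	/* max number of fast restarts */

struct restart_state {
	int	started;	/* 0 until the first start was recorded */
	time_t	last_start;	/* time of the most recent start */
	int	fast_count;	/* consecutive fast restarts seen */
};

/*
 * Record a (re)start at time now. Returns 1 if the VM may start,
 * 0 if it hit the fast-reboot limit and should be stopped.
 */
int
restart_allowed(struct restart_state *rs, time_t now)
{
	if (rs->started && now - rs->last_start < VM_START_RATE_SEC)
		rs->fast_count++;
	else
		rs->fast_count = 0;	/* assumption: slow restart resets */
	rs->started = 1;
	rs->last_start = now;
	return rs->fast_count < VM_START_RATE_LIMIT;
}
```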


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (don't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@
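
The naming rule from revision 1.66 can be sketched as a small predicate. This is a minimal illustration, not the function from vmd.c: the name vm_name_is_valid is hypothetical, and treating an empty string as invalid is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if name consists of alphanumeric
 * characters plus '.', '_' and '-', and does not start with one of
 * those three; returns 0 otherwise.
 */
int
vm_name_is_valid(const char *name)
{
	size_t i;

	assert(name != NULL);

	/* Assumption: an empty name is rejected. */
	if (name[0] == '\0')
		return 0;
	/* The first character must be alphanumeric. */
	if (!isalnum((unsigned char)name[0]))
		return 0;
	for (i = 1; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return 0;
	}
	return 1;
}
```

Under this sketch, "openbsd.vm" is accepted while ".hidden" and "bad name" are rejected.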


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of the VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings, which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled-by-default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Further network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root, but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after the
last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.119 23-Sep-2020 martijn

Revert agentx support for now, we're too close to release.

requested by deraadt@


# 1.118 23-Sep-2020 martijn

Add support for agentx to vmd.

This is based around VM-MIB from RFC7666, but does not export the full
spec. People more knowledgeable of vmd are encouraged to expand on this.


Revision tags: OPENBSD_6_7_BASE
# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30
seconds between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
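
The batching described in revision 1.117 amounts to simple integer arithmetic. The following is a minimal sketch under stated assumptions, not the scheduler in vmd: the helper name stagger_start_delay and the constant STAGGER_DELAY_SEC are hypothetical, and the parallelism is passed in rather than read from the host.

```c
#include <assert.h>

/* Default quoted in the commit message: 30 seconds between batches. */
#define STAGGER_DELAY_SEC	30

/*
 * Hypothetical helper: seconds to wait before starting the VM at
 * position idx (0-based) when at most `parallel` VMs start per batch.
 */
int
stagger_start_delay(int idx, int parallel)
{
	assert(idx >= 0 && parallel > 0);

	/* Integer division selects the batch; each batch waits one more delay. */
	return (idx / parallel) * STAGGER_DELAY_SEC;
}
```

With a parallelism of 4, VMs 0 through 3 start immediately, VMs 4 through 7 after 30 seconds, and so on.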


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
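
The convention in revision 1.114 can be shown with a short wrapper around open(2). This is an illustrative sketch only: open_readonly is a hypothetical name and is not taken from vmd.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

/*
 * Hypothetical wrapper: open a file read-only and detect failure by
 * comparing against -1 exactly, since errno is only updated in that
 * case. On failure the errno value is stored in *saved_errno.
 */
int
open_readonly(const char *path, int *saved_errno)
{
	int fd;

	assert(path != NULL && saved_errno != NULL);

	fd = open(path, O_RDONLY);
	if (fd == -1) {			/* not "fd < 0" */
		*saved_errno = errno;
		return -1;
	}
	*saved_errno = 0;
	return fd;
}
```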


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this will make it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
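
The single-bitfield approach from revision 1.110 might look roughly like the sketch below. The flag names and the struct are illustrative assumptions; the real definitions in vmd may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag names, one bit per state. */
#define EX_VM_STATE_RUNNING	0x01
#define EX_VM_STATE_PAUSED	0x02
#define EX_VM_STATE_SHUTDOWN	0x04

struct ex_vm {
	unsigned int	 state;	/* bitwise OR of EX_VM_STATE_* */
};

/* Set one or more state bits. */
void
ex_vm_set(struct ex_vm *vm, unsigned int flag)
{
	assert(vm != NULL);
	vm->state |= flag;
}

/* Clear one or more state bits. */
void
ex_vm_clear(struct ex_vm *vm, unsigned int flag)
{
	assert(vm != NULL);
	vm->state &= ~flag;
}

/* Return 1 if any of the given bits is set, 0 otherwise. */
int
ex_vm_has(const struct ex_vm *vm, unsigned int flag)
{
	assert(vm != NULL);
	return (vm->state & flag) != 0;
}
```

A paused VM can then carry both the running and paused bits at once, which separate boolean variables make harder to keep consistent.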


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
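
The rate limit from revision 1.104 can be sketched as a counter that is checked on each restart. The two constants use the values quoted in the commit message; the struct, the function name, and the choice to reset the counter after a slow reboot are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define VM_START_RATE_SEC	6	/* values quoted in the commit message */
#define VM_START_RATE_LIMIT	3

struct ex_vm_rate {
	time_t	 last_start;	/* time of the previous start, 0 if none */
	int	 fast_count;	/* consecutive fast restarts seen so far */
};

/*
 * Record a (re)start at time `now`. Returns 1 if the VM may start,
 * 0 if it should be stopped because it rebooted too quickly too often.
 */
int
ex_vm_rate_check(struct ex_vm_rate *r, time_t now)
{
	assert(r != NULL);

	if (r->last_start != 0 && now - r->last_start < VM_START_RATE_SEC)
		r->fast_count++;
	else
		r->fast_count = 0;	/* assumption: a slow reboot resets it */
	r->last_start = now;
	return r->fast_count < VM_START_RATE_LIMIT;
}
```

In this sketch the third fast reboot in a row is the one that gets refused.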


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
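
The failure mode in revision 1.100 comes from zero-initialized fd fields: 0 is a valid descriptor (stdin), so closing "unused" fields closes fd 0. A minimal sketch of the safer pattern follows; the struct and function names are hypothetical and the array size is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#define EX_NFDS	4	/* illustrative size */

struct ex_vm_fds {
	int	 fd[EX_NFDS];
};

/* Mark every slot as unused with -1, never with 0. */
void
ex_fds_init(struct ex_vm_fds *f)
{
	int i;

	assert(f != NULL);
	for (i = 0; i < EX_NFDS; i++)
		f->fd[i] = -1;
}

/* Close every open descriptor; returns the number actually closed. */
int
ex_fds_close(struct ex_vm_fds *f)
{
	int i, n = 0;

	assert(f != NULL);
	for (i = 0; i < EX_NFDS; i++) {
		if (f->fd[i] == -1)	/* avoid calling close(-1) */
			continue;
		close(f->fd[i]);
		f->fd[i] = -1;
		n++;
	}
	return n;
}
```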


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks, for instance.

OK mlarkin@
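
The ordering in revision 1.97 (open first, then check the descriptor) can be sketched with fstat(2). This is an illustration of the pattern only: ex_open_regular is a hypothetical name, and the regular-file test stands in for whatever ownership and permission checks vmd actually performs.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: open path, then validate the object behind the
 * descriptor. Returns the fd, or -1 if the file cannot be opened or is
 * not a regular file.
 */
int
ex_open_regular(const char *path)
{
	struct stat st;
	int fd;

	assert(path != NULL);

	if ((fd = open(path, O_RDONLY | O_NONBLOCK)) == -1)
		return -1;
	/*
	 * fstat(2) inspects the file that was opened, not the path, so
	 * replacing the path after the check cannot change the result.
	 */
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode)) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Checking with stat(2) on the path before open(2) would leave a window in which the path could be swapped for another file.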


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (don't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of the VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings, which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root, but pledged as
  "stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
  and the primary communication with the vmm(4) subsystem. It runs as _vmd
  in the chroot but does not use pledge, as the vmm ioctls are not allowed
  by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.117 12-Dec-2019 pd

vmd: start vms defined in vm.conf in a staggered fashion

This addresses the 'thundering herd' problem when a lot of
vms are configured in vm.conf. A lot of vms booting in parallel can
overload the host and also mess up tsc calibration in openbsd guests as
it uses PIT which doesn't fire reliably if the host is overloaded.

We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds
between batches. This is configurable in vm.conf.

ok mlarkin@ (also addressed comments from cheloha@)
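
As a rough illustration of the staggered start described above, the start
delay of each configured vm can be derived from its position in the list.
This is only a sketch: the function name and parameters are hypothetical and
not taken from vmd, and the real scheduling may differ.

```c
/*
 * Hypothetical helper: seconds to wait before starting the vm at
 * position `idx` (0-based) when `parallel` vms are started per batch
 * and batches are `delay_sec` seconds apart.
 */
unsigned int
vm_start_delay(unsigned int idx, unsigned int parallel, unsigned int delay_sec)
{
	if (parallel == 0)
		parallel = 1;	/* assumption: never divide by zero */
	return (idx / parallel) * delay_sec;
}
```

With a parallelism of 4 and a 30 second delay, vms 0 to 3 start immediately
and vms 4 to 7 start 30 seconds later.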


Revision tags: OPENBSD_6_6_BASE
# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
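
A minimal sketch of the stricter idiom, using open(2) as the example. The
helper name try_open is made up for illustration and is not part of vmd.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 0 if `path` can be opened read-only,
 * otherwise the errno value set by open(2).
 */
int
try_open(const char *path)
{
	int fd;

	/* Compare against -1 exactly; errno is only valid in that case. */
	if ((fd = open(path, O_RDONLY)) == -1)
		return errno;
	close(fd);
	return 0;
}
```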


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
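
The idea of a single state bitfield can be sketched as follows. The flag
names, bit values and the reporting order are illustrative assumptions and
are not copied from vmd's headers.

```c
/* Illustrative flags; the real names and values live in vmd's headers. */
#define VMS_RUNNING	0x01
#define VMS_DISABLED	0x02
#define VMS_PAUSED	0x04
#define VMS_SHUTDOWN	0x08

/* Map a state bitfield to a short label, most specific state first. */
const char *
vm_state_str(unsigned int state)
{
	if (state & VMS_PAUSED)
		return "paused";
	if (state & VMS_SHUTDOWN)
		return "stopping";
	if (state & VMS_RUNNING)
		return "running";
	if (state & VMS_DISABLED)
		return "disabled";
	return "stopped";
}
```

Pausing then becomes `state |= VMS_PAUSED` and unpausing
`state &= ~VMS_PAUSED`, without touching the other bits.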


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
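
The rate limit described above can be sketched like this. The two #defines
use the values quoted in the message; the struct, its fields, the function
name and the choice to reset the counter after a slow restart are
assumptions made for illustration.

```c
#include <time.h>

#define VM_START_RATE_SEC	6	/* value quoted in the message */
#define VM_START_RATE_LIMIT	3	/* value quoted in the message */

/* Hypothetical per-VM bookkeeping. */
struct vm_rate {
	time_t	last_start;	/* previous start time, 0 if none */
	int	fast_count;	/* consecutive fast restarts so far */
};

/*
 * Record a (re)start at time `now`.  Returns 0 if the VM may start,
 * -1 if it rebooted too quickly too often and should be stopped.
 */
int
vm_rate_check(struct vm_rate *r, time_t now)
{
	if (r->last_start != 0 && now - r->last_start < VM_START_RATE_SEC)
		r->fast_count++;
	else
		r->fast_count = 0;	/* assumption: slow restart resets */
	r->last_start = now;

	return (r->fast_count >= VM_START_RATE_LIMIT) ? -1 : 0;
}
```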


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@
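
Tracking limits "in total usage" can be pictured as a per-user running
total that each new VM is checked against. The limits below are the values
quoted in the message; the struct and function names are hypothetical, and
one CPU per VM is assumed in the usage example.

```c
#define USER_MAX_CPUS	4	/* quoted limit */
#define USER_MAX_MEM_MB	2048	/* quoted limit: 2G */
#define USER_MAX_IFS	8	/* quoted limit */

/* Hypothetical running totals for one non-root user. */
struct user_usage {
	unsigned int	cpus;
	unsigned int	mem_mb;
	unsigned int	ifs;
};

/*
 * Try to account for a new VM.  Returns 0 and updates the totals if
 * the VM fits within the limits, -1 (totals unchanged) otherwise.
 */
int
user_reserve(struct user_usage *u, unsigned int cpus, unsigned int mem_mb,
    unsigned int ifs)
{
	if (u->cpus + cpus > USER_MAX_CPUS ||
	    u->mem_mb + mem_mb > USER_MAX_MEM_MB ||
	    u->ifs + ifs > USER_MAX_IFS)
		return -1;
	u->cpus += cpus;
	u->mem_mb += mem_mb;
	u->ifs += ifs;
	return 0;
}
```

Four VMs with 512M each fit, as does a single VM with 2G, but a fifth VM
or any additional memory is refused.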


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks, for instance.

OK mlarkin@
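
The general pattern of checking after opening is to call fstat(2) on the
already-open descriptor instead of stat(2) on the path, so the file that was
checked is the file that gets used. The sketch below is a simplified
illustration: the function name and the ownership policy are assumptions,
and vmd's actual checks are more detailed.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` read-only and verify, on the open
 * fd, that it is a regular file and (for non-root `uid`) owned by that
 * user.  Returns the fd on success, -1 with errno set on failure.
 */
int
open_checked(const char *path, uid_t uid)
{
	struct stat st;
	int fd;

	if ((fd = open(path, O_RDONLY | O_NONBLOCK)) == -1)
		return -1;
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode) ||
	    (uid != 0 && st.st_uid != uid)) {
		close(fd);
		errno = EACCES;	/* simplified: one error for all refusals */
		return -1;
	}
	return fd;
}
```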


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
  CentOS 6 (secondary), Ubuntu 17.10 (secondary).
  NOTE: Secondary indicates some issue(s) preventing full/reliable
  functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
  default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@
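
The naming rule above can be sketched as a small check. The function name is
hypothetical, and rejecting the empty name is an assumption; this is not the
code from the commit.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical check: returns 1 if `name` consists of alphanumerics,
 * '.', '_' and '-' and starts with an alphanumeric, 0 otherwise.
 */
int
vm_name_valid(const char *name)
{
	size_t i;

	if (name == NULL || name[0] == '\0')
		return 0;	/* assumption: empty names are invalid */
	if (!isalnum((unsigned char)name[0]))
		return 0;	/* must not start with '.', '_' or '-' */
	for (i = 1; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return 0;
	}
	return 1;
}
```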


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and invividual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow perform a cold reboot of VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This make the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warning which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms is vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". For network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, e.g.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows removing the vm from the
daemon's list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.116 04-Sep-2019 mlarkin

vmd(8): memory leak in an error path

Found by Hiltjo Posthuma, thanks!


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.
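
The convention described in this commit can be illustrated with a minimal
sketch. The helper name ex_open_ro() is hypothetical and not taken from
vmd; the point is only that the failure test compares against -1 exactly
and that errno is read only on that path.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from vmd: open `path` read-only.
 * On success store the descriptor in *fdp and return 0; on failure
 * leave *fdp at -1 and return the errno value set by open(2).
 */
static int
ex_open_ro(const char *path, int *fdp)
{
	int fd;

	*fdp = -1;
	/* Compare against -1 exactly, not "< 0". */
	if ((fd = open(path, O_RDONLY)) == -1)
		return errno;	/* errno is only meaningful on this path */
	*fdp = fd;
	return 0;
}
```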


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
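
As a rough illustration of the single-bitfield approach, the sketch below
keeps all state flags in one unsigned field. The EX_STATE_* names and
their values, struct ex_vm, and the helper functions are assumptions made
for this example; vmd's real definitions may differ.

```c
/* Hypothetical flag values; vmd's real names and values may differ. */
#define EX_STATE_RUNNING	0x01u
#define EX_STATE_PAUSED		0x02u
#define EX_STATE_DISABLED	0x04u

struct ex_vm {
	unsigned int	state;	/* OR of EX_STATE_* flags */
};

/* Pausing only makes sense for a running vm. */
static void
ex_pause(struct ex_vm *vm)
{
	if (vm->state & EX_STATE_RUNNING)
		vm->state |= EX_STATE_PAUSED;
}

static void
ex_unpause(struct ex_vm *vm)
{
	vm->state &= ~EX_STATE_PAUSED;
}

/* Map the flags to a label of the kind a status listing could print. */
static const char *
ex_state_name(const struct ex_vm *vm)
{
	if (vm->state & EX_STATE_PAUSED)
		return "paused";
	if (vm->state & EX_STATE_RUNNING)
		return "running";
	if (vm->state & EX_STATE_DISABLED)
		return "disabled";
	return "stopped";
}
```

With one field, reporting and checking a state is a single mask test
instead of a comparison across several variables.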


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST, a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
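
The rate limit described above could be sketched as follows. Only the two
constants and their values come from the commit message; struct
ex_start_rate, ex_start_allowed(), and the choice to reset the counter
after a slow restart are assumptions for illustration, not vmd's code.

```c
#include <stdbool.h>
#include <time.h>

#define VM_START_RATE_SEC	6	/* value from the commit message */
#define VM_START_RATE_LIMIT	3	/* value from the commit message */

/* Hypothetical per-VM bookkeeping; not vmd's actual structure. */
struct ex_start_rate {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_count;	/* consecutive fast restarts seen so far */
};

/*
 * Record a (re)start at time `now`.  Returns true if the VM may start,
 * false once VM_START_RATE_LIMIT fast restarts have accumulated.
 */
static bool
ex_start_allowed(struct ex_start_rate *sr, time_t now)
{
	if (sr->last_start != 0 && now - sr->last_start < VM_START_RATE_SEC)
		sr->fast_count++;
	else
		sr->fast_count = 0;	/* assumption: a slow restart resets */
	sr->last_start = now;
	return sr->fast_count < VM_START_RATE_LIMIT;
}
```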


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
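
A small sketch of the pattern this fix relies on: mark every unused
descriptor slot with -1 and skip those slots when closing. The structure
and function names are hypothetical and only illustrate the idea; a
zero-initialized slot would otherwise look like descriptor 0.

```c
#include <unistd.h>

#define EX_NFDS	4

/* Hypothetical container for a VM's descriptors; not vmd's layout. */
struct ex_fds {
	int	fd[EX_NFDS];
};

/* Mark every slot unused, so a forgotten slot is never mistaken for fd 0. */
static void
ex_fds_init(struct ex_fds *f)
{
	int i;

	for (i = 0; i < EX_NFDS; i++)
		f->fd[i] = -1;
}

/* Close only the slots in use; returns how many were closed. */
static int
ex_fds_close(struct ex_fds *f)
{
	int i, n = 0;

	for (i = 0; i < EX_NFDS; i++) {
		if (f->fd[i] == -1)
			continue;	/* never call close(-1) */
		close(f->fd[i]);
		f->fd[i] = -1;
		n++;
	}
	return n;
}
```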


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks, for instance.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters plus '.', '_' and '-', but does not start with any of the latter
three.

ok mlarkin@ pd@
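
The naming rule stated in this commit can be sketched as a small check.
The function name ex_valid_vm_name() and the rejection of empty names are
assumptions for this example, not necessarily what vmd does.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical validator following the rule in the commit message:
 * alphanumeric characters plus '.', '_' and '-', where the first
 * character must not be one of those three.
 */
static bool
ex_valid_vm_name(const char *name)
{
	const unsigned char *p = (const unsigned char *)name;

	if (name == NULL || *p == '\0')	/* assumption: reject empty names */
		return false;
	if (*p == '.' || *p == '_' || *p == '-')
		return false;
	for (; *p != '\0'; p++)
		if (!isalnum(*p) && *p != '.' && *p != '_' && *p != '-')
			return false;
	return true;
}
```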


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of the VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.115 14-Aug-2019 anton

Improve the error message when supplying an invalid template to vmctl
start. Favoring 'invalid template' over 'permission denied' should give
the user a better hint on what went wrong.

ok kn@ mlarkin@


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this will makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
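
As a rough illustration of the single-bitfield approach from revision 1.110, the sketch below keeps all state in one unsigned int. The flag names, their values, and the helper functions are hypothetical and are not taken from vmd's headers; the precedence used when naming a state is also an assumption.

```c
#include <stdbool.h>

/* Hypothetical flag names and values; vmd's real definitions may differ. */
#define SK_STATE_RUNNING	0x01
#define SK_STATE_PAUSED		0x02
#define SK_STATE_DISABLED	0x04

/* Set one or more state bits. */
unsigned int
sk_state_set(unsigned int state, unsigned int bits)
{
	return (state | bits);
}

/* Clear one or more state bits. */
unsigned int
sk_state_clear(unsigned int state, unsigned int bits)
{
	return (state & ~bits);
}

/* True if every bit in `bits` is set. */
bool
sk_state_has(unsigned int state, unsigned int bits)
{
	return ((state & bits) == bits);
}

/* Map the bitfield to a label; the precedence order is an assumption. */
const char *
sk_state_name(unsigned int state)
{
	if (state & SK_STATE_PAUSED)
		return ("paused");
	if (state & SK_STATE_RUNNING)
		return ("running");
	if (state & SK_STATE_DISABLED)
		return ("disabled");
	return ("stopped");
}
```

A paused VM here carries both the running and the paused bit, which is one way a single field can replace several separate variables.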


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
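
The rate-limit rule from revision 1.104 can be sketched roughly as follows. This is an illustrative sketch, not vmd's code: struct start_rate and start_rate_check() are hypothetical names, and only the two constants and their values come from the log message. The sketch also assumes that a slow restart resets the counter, which the log message does not state.

```c
#include <stdbool.h>
#include <time.h>

/* Values quoted in the commit message. */
#define VM_START_RATE_SEC	6	/* restarts closer than this are "fast" */
#define VM_START_RATE_LIMIT	3	/* fast restarts allowed before stopping */

/* Hypothetical per-VM bookkeeping; not vmd's real structure. */
struct start_rate {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_count;	/* consecutive fast restarts seen so far */
};

/*
 * Record a (re)start at time `now`.  Returns true if the VM may start,
 * false if it has restarted too quickly too often and should be stopped.
 */
bool
start_rate_check(struct start_rate *sr, time_t now)
{
	if (sr->last_start != 0 &&
	    now - sr->last_start < VM_START_RATE_SEC)
		sr->fast_count++;
	else
		sr->fast_count = 0;	/* assumption: slow restart resets */
	sr->last_start = now;

	return (sr->fast_count < VM_START_RATE_LIMIT);
}
```

With these values, a VM that restarts once per second is allowed its first start and two fast restarts, and is refused on the third fast restart.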


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
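
The bug class fixed in revision 1.100 can be illustrated with a short sketch. The structure and function names below are hypothetical and do not mirror vmd's data structures; the point is only that a zero-filled fd field looks like stdin, so every field is set to -1 up front and -1 is never passed to close(2).

```c
#include <unistd.h>

#define SKETCH_NDISKS	4	/* hypothetical array size */

/* Hypothetical set of descriptors belonging to one VM. */
struct vm_fds {
	int	kernel_fd;
	int	tty_fd;
	int	disk_fds[SKETCH_NDISKS];
};

/* Mark every descriptor as "not open" so that 0 is never closed by mistake. */
void
vm_fds_init(struct vm_fds *f)
{
	int i;

	f->kernel_fd = -1;
	f->tty_fd = -1;
	for (i = 0; i < SKETCH_NDISKS; i++)
		f->disk_fds[i] = -1;
}

/* Close one descriptor if it is open, then mark it as closed. */
void
fd_close(int *fdp)
{
	if (*fdp == -1)
		return;		/* avoid calling close(-1) */
	close(*fdp);
	*fdp = -1;
}

/* Close everything that is still open. */
void
vm_fds_close(struct vm_fds *f)
{
	int i;

	fd_close(&f->kernel_fd);
	fd_close(&f->tty_fd);
	for (i = 0; i < SKETCH_NDISKS; i++)
		fd_close(&f->disk_fds[i]);
}
```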


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@
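
The "tracked in total usage" rule from revision 1.98 might look roughly like the sketch below. The limit values (4 CPUs, 2G of memory, 8 interfaces) come from the log message; the macro names, struct usage, and usage_claim() are hypothetical and are not vmd's actual interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Limits quoted in the commit message; the macro names are made up. */
#define SK_MAX_CPUS	4U
#define SK_MAX_MEM	(2ULL * 1024 * 1024 * 1024)	/* 2G in bytes */
#define SK_MAX_IFS	8U

/* Hypothetical running totals for one non-root user. */
struct usage {
	unsigned int	cpus;
	uint64_t	mem;
	unsigned int	ifs;
};

/*
 * Try to account for a new VM.  Returns true and updates the totals if
 * the request fits in what is left, false (totals unchanged) otherwise.
 * The totals never exceed the limits, so the subtractions cannot wrap.
 */
bool
usage_claim(struct usage *u, unsigned int cpus, uint64_t mem,
    unsigned int ifs)
{
	if (cpus > SK_MAX_CPUS - u->cpus ||
	    mem > SK_MAX_MEM - u->mem ||
	    ifs > SK_MAX_IFS - u->ifs)
		return (false);

	u->cpus += cpus;
	u->mem += mem;
	u->ifs += ifs;
	return (true);
}
```

This matches the example in the log message: four single-CPU VMs with 512M each fit, as does one VM with 2G, but nothing more fits after either.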


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (don't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-'. but does not start with the latter
three.

ok mlarkin@ pd@
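
The naming rule from revision 1.66 can be expressed as a short check. This is a sketch under stated assumptions, not vmd's function: vm_name_ok() is a hypothetical name, and rejecting NULL or empty names is an assumption the log message does not spell out.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical validator for the rule in the log message: names consist
 * of alphanumerics plus '.', '_' and '-', and must not start with one of
 * those three characters.
 */
bool
vm_name_ok(const char *name)
{
	size_t i;

	/* Assumption: NULL and empty names are rejected. */
	if (name == NULL || name[0] == '\0')
		return (false);

	/* The first character must be alphanumeric. */
	if (!isalnum((unsigned char)name[0]))
		return (false);

	for (i = 1; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return (false);
	}
	return (true);
}
```

Under this rule a name such as "openbsd.vm" is accepted, while "-vm" and "my vm" are rejected.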


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". For network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) but runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.114 28-Jun-2019 deraadt

When system calls indicate an error they return -1, not some arbitrary
value < 0. errno is only updated in this case. Change all (most?)
callers of syscalls to follow this better, and let's see if this strictness
helps us in the future.


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this will makes it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of suchs
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This works is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds ot the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating disk derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) clould close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after openening the fd.

This prevents time of TOCTOU attacks for instances.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t optional in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

Check that the vm id is valid before sending it to vmm for pausing. The 'lock'
is caused by vmm sending back ENOENT for a nonexistent vm, but vmd drops the
message because it doesn't recognize the vmid vmm is talking about. This is an
artifact of the policy "don't trust any imsg from a sibling privsep process and
do your own checking".

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name consists of alphanumeric
characters plus '.', '_' and '-', but does not start with any of the latter
three.

ok mlarkin@ pd@
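
The naming rule above can be illustrated with a minimal sketch. The function
name vm_name_is_valid is hypothetical and is not taken from vmd(8); rejecting
empty names is an assumption the log message does not state.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (not vmd's actual function): returns 1 if the
 * name follows the rule described in the log message, 0 otherwise.
 */
int
vm_name_is_valid(const char *name)
{
	size_t i;

	/* Assumption: NULL and empty names are rejected. */
	if (name == NULL || name[0] == '\0')
		return (0);

	/* The first character must be alphanumeric, not '.', '_' or '-'. */
	if (!isalnum((unsigned char)name[0]))
		return (0);

	/* Remaining characters: alphanumeric or one of '.', '_', '-'. */
	for (i = 1; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return (0);
	}
	return (1);
}
```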


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows configuring VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and is incremented on
every boot during the runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". For network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root, but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.113 20-May-2019 jasper

drop fatalx calls when claiming a new vm id; otherwise it's possible
to crash vmd and take all other vms with it. this required a little
shuffling to get the error value reported back to the caller to
handle the error properly.

ok mlarkin@


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this will make it easier for vmd to report
and check on the individual vm states

no functional change intended

ok ccardenas@ mlarkin@
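
As an illustration of the single-bitfield approach described above, here is a
minimal sketch. The EX_VMSTATE_* names, their values, and the precedence order
in ex_vm_state_name() are hypothetical and are not taken from vmd(8).

```c
/* Hypothetical state bits; vmd's real constants may differ. */
#define EX_VMSTATE_RUNNING	0x01
#define EX_VMSTATE_PAUSED	0x02
#define EX_VMSTATE_DISABLED	0x04
#define EX_VMSTATE_SHUTDOWN	0x08

/*
 * Map a state bitfield to a label for status output. The most
 * specific condition is tested first: a paused VM is still running,
 * so "paused" must win over "running".
 */
const char *
ex_vm_state_name(unsigned int state)
{
	if (state & EX_VMSTATE_PAUSED)
		return ("paused");
	if (state & EX_VMSTATE_SHUTDOWN)
		return ("stopping");
	if (state & EX_VMSTATE_RUNNING)
		return ("running");
	if (state & EX_VMSTATE_DISABLED)
		return ("disabled");
	return ("stopped");
}
```

Setting and clearing a state then becomes a single bit operation, for example
state |= EX_VMSTATE_PAUSED on pause and state &= ~EX_VMSTATE_PAUSED on unpause.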


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
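
The rate limit described above can be sketched roughly as follows. Only the
two #define values come from the log message; struct ex_vm_rate,
ex_vm_rate_check(), and the choice to reset the counter after a slow restart
are assumptions made for illustration.

```c
#include <time.h>

#define VM_START_RATE_SEC	6	/* restarts faster than this count */
#define VM_START_RATE_LIMIT	3	/* fast restarts before stopping */

/* Hypothetical per-VM bookkeeping, not vmd's actual structure. */
struct ex_vm_rate {
	time_t	last_start;	/* 0 means "never started" */
	int	fast_restarts;
};

/*
 * Record a (re)start at time "now". Returns 1 if the VM has hit the
 * limit and should be stopped, 0 if it may start.
 */
int
ex_vm_rate_check(struct ex_vm_rate *r, time_t now)
{
	if (r->last_start != 0 && now - r->last_start < VM_START_RATE_SEC)
		r->fast_restarts++;
	else
		r->fast_restarts = 0;	/* assumption: slow restart resets */
	r->last_start = now;

	return (r->fast_restarts >= VM_START_RATE_LIMIT);
}
```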


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
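
The pattern behind this fix can be sketched as below. A zero-filled structure
leaves every fd field at 0, which is a valid descriptor (stdin), so the fields
are set to -1 up front and -1 is skipped on close. struct ex_vm_fds and both
functions are hypothetical and do not mirror vmd's actual data structures.

```c
#include <unistd.h>

#define EX_NFDS	4

/* Hypothetical container for a VM's descriptors. */
struct ex_vm_fds {
	int	fd[EX_NFDS];
};

/* Mark every slot as "not open" so that 0 is never closed by mistake. */
void
ex_vm_fds_init(struct ex_vm_fds *f)
{
	int i;

	for (i = 0; i < EX_NFDS; i++)
		f->fd[i] = -1;
}

/* Close all open slots, skipping -1; returns the number closed. */
int
ex_vm_fds_close(struct ex_vm_fds *f)
{
	int i, n = 0;

	for (i = 0; i < EX_NFDS; i++) {
		if (f->fd[i] == -1)
			continue;
		close(f->fd[i]);
		f->fd[i] = -1;
		n++;
	}
	return (n);
}
```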


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks, for instance.

OK mlarkin@
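
A minimal sketch of the open-then-check ordering follows. Checking with
fstat(2) on the already-open descriptor means the file that was checked is the
file that gets used, whereas stat(2) on the path before open(2) leaves a
window in which the path can be swapped. The function ex_open_checked() and
its policy (regular file, owned by the requesting uid unless that uid is 0)
are assumptions for illustration, not vmd's actual permission rules.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open path read-only, then validate the open
 * descriptor. Returns the fd on success, -1 on failure.
 */
int
ex_open_checked(const char *path, uid_t uid)
{
	struct stat st;
	int fd;

	if ((fd = open(path, O_RDONLY)) == -1)
		return (-1);

	/* Check the object behind the fd, not the path. */
	if (fstat(fd, &st) == -1 ||
	    !S_ISREG(st.st_mode) ||
	    (uid != 0 && st.st_uid != uid)) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```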


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows opening vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

Check that the vm id is valid before sending it to vmm for pausing. The 'lock'
is caused by vmm sending back ENOENT for a nonexistent vm, but vmd drops the
message because it doesn't recognize the vmid vmm is talking about. This is an
artifact of the policy "don't trust any imsg from a sibling privsep process and
do your own checking".

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name consists of alphanumeric
characters plus '.', '_' and '-', but does not start with any of the latter
three.

ok mlarkin@ pd@


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows configuring VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and is incremented on
every boot during the runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings, which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root, but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking for pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.112 11-May-2019 jasper

report vm state through 'vmctl status'; whereas previously this would display the state of
the vcpu (which is why it got removed), it now actually reports the correct state
(running, stopped, disabled, paused, etc)

ok ccardenas@ mlarkin@


# 1.111 11-May-2019 jasper

vm_dump_header allocated space for a signature but it was never set;
set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image

ok mlarkin@ pd@


# 1.110 11-May-2019 jasper

track the state of the vm (running, paused, etc) using a single bitfield instead of
a handful of separate variables. this makes it easier for vmd to report
and check on the individual vm states
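
As an illustration only, the single-bitfield approach could be sketched
as below. The EX_* flag names, their values, and struct ex_vm are
hypothetical and are not taken from vmd's headers; only the idea of
keeping every state in one bitfield comes from the commit message.

```c
#include <stdbool.h>

/* Hypothetical flag names and values; vmd's real definitions may differ. */
#define EX_STATE_RUNNING	0x01
#define EX_STATE_DISABLED	0x02
#define EX_STATE_PAUSED		0x04

struct ex_vm {
	unsigned int	state;	/* OR of EX_STATE_* bits */
};

/* Set or clear the paused bit without touching the other bits. */
static void
ex_set_paused(struct ex_vm *vm, bool paused)
{
	if (paused)
		vm->state |= EX_STATE_PAUSED;
	else
		vm->state &= ~EX_STATE_PAUSED;
}

/* Map the bitfield to one printable state; paused takes precedence. */
static const char *
ex_state_name(const struct ex_vm *vm)
{
	if (vm->state & EX_STATE_PAUSED)
		return ("paused");
	if (vm->state & EX_STATE_RUNNING)
		return ("running");
	if (vm->state & EX_STATE_DISABLED)
		return ("disabled");
	return ("stopped");
}
```

With one field, a status report only has to inspect vm->state rather
than several independent variables that can drift out of sync.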

no functional change intended

ok ccardenas@ mlarkin@


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.
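
A minimal sketch of the rule described above, not the actual vmd code.
The two constants and their values come from this message; struct
start_rate, start_allowed(), and the choice to reset the counter after
a slow restart are assumptions made for illustration.

```c
#include <stdbool.h>
#include <time.h>

#define VM_START_RATE_SEC	6	/* value stated in the commit message */
#define VM_START_RATE_LIMIT	3	/* value stated in the commit message */

/* Hypothetical per-VM bookkeeping; not vmd's actual structure. */
struct start_rate {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_count;	/* consecutive fast restarts seen so far */
};

/*
 * Record a (re)start at time `now'.  Returns true if the VM may start,
 * false once it has restarted too quickly VM_START_RATE_LIMIT times.
 */
static bool
start_allowed(struct start_rate *sr, time_t now)
{
	if (sr->last_start != 0 && now - sr->last_start < VM_START_RATE_SEC)
		sr->fast_count++;
	else
		sr->fast_count = 0;	/* assumption: a slow restart resets */
	sr->last_start = now;
	return (sr->fast_count < VM_START_RATE_LIMIT);
}
```

A caller would stop the VM instead of restarting it when
start_allowed() returns false.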

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks, for instance.
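
The general open-then-check pattern can be sketched as follows. This is
an illustration under assumptions, not vmd's implementation: the
function name open_regular_checked() and the specific ownership rule
(root or matching uid) are hypothetical. The point is that fstat(2) is
applied to the already opened descriptor, so the file cannot be swapped
between the check and the use.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open `path' read-only, then validate the opened file itself.
 * Returns the fd on success, -1 if the open fails, the file is not
 * a regular file, or a non-root `uid' does not own it (hypothetical
 * policy chosen for this sketch).
 */
static int
open_regular_checked(const char *path, uid_t uid)
{
	struct stat st;
	int fd;

	if ((fd = open(path, O_RDONLY | O_NONBLOCK)) == -1)
		return (-1);
	/* Check the descriptor, not the path, to avoid the race. */
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode) ||
	    (uid != 0 && st.st_uid != uid)) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```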

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.
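
The rule can be expressed as a small check like the one below. This is
a sketch, not the function committed here: name_is_valid() is a
hypothetical name, and rejecting the empty string is an assumption the
message does not state.

```c
#include <ctype.h>
#include <stdbool.h>

/*
 * Accept names made of alphanumerics, '.', '_' and '-' that do not
 * start with '.', '_' or '-'.
 */
static bool
name_is_valid(const char *name)
{
	const unsigned char *p = (const unsigned char *)name;

	/* Assumption: an empty name is also invalid. */
	if (*p == '\0' || *p == '.' || *p == '_' || *p == '-')
		return (false);
	for (; *p != '\0'; p++)
		if (!isalnum(*p) && *p != '.' && *p != '_' && *p != '-')
			return (false);
	return (true);
}
```

Under this sketch "openbsd.vm" is accepted, while "-foo" and "a b" are
rejected.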

ok mlarkin@ pd@


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtimg of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and invividual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow perform a cold reboot of VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This make the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warning which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms is vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". For network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number of or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows removing the vm from the
daemon's list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will check
whether a specified VM is already running. "load" will allow adding
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.); it runs as root but is pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it
after the last VM is terminated. This allows removing the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.109 11-May-2019 jasper

sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process
knows the vm is paused, but vmd does not.

ok mlarkin@ pd@


Revision tags: OPENBSD_6_5_BASE
# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST, a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
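
The rate limit described in 1.104 can be sketched roughly as follows. This
is only an illustration: the two constants are the values quoted in the
commit message, while struct start_rate, start_rate_check() and the
decision to reset the counter after a slow restart are assumptions, not
vmd's actual code.

```c
#include <stdbool.h>
#include <time.h>

/* Values quoted in the commit message. */
#define VM_START_RATE_SEC	6
#define VM_START_RATE_LIMIT	3

/* Hypothetical per-VM bookkeeping; not vmd's real structure. */
struct start_rate {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_reboots;	/* consecutive starts that came too quickly */
};

/*
 * Record a start at time `now'.  Returns true if the VM may start,
 * false once VM_START_RATE_LIMIT fast reboots have been counted.
 */
bool
start_rate_check(struct start_rate *sr, time_t now)
{
	if (sr->last_start != 0 &&
	    now - sr->last_start < VM_START_RATE_SEC)
		sr->fast_reboots++;
	else
		sr->fast_reboots = 0;	/* assumption: slow restart resets */
	sr->last_start = now;

	return (sr->fast_reboots < VM_START_RATE_LIMIT);
}
```

A caller would run the check before each (re)start and stop the VM when
it returns false.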


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@
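
The lookup order described in 1.103 (derived image first, then its base)
can be illustrated with a toy model. Everything here is hypothetical: the
struct image layout, the fixed NCLUSTERS and image_lookup() are
simplifications for illustration and say nothing about the real qcow2
on-disk format or vmd's implementation.

```c
#include <stddef.h>

#define NCLUSTERS	4	/* toy image size, for illustration only */

/* Hypothetical in-memory image: a NULL cluster means "not allocated". */
struct image {
	const struct image	*base;			/* NULL if no base */
	const char		*cluster[NCLUSTERS];
};

/*
 * Look up a cluster, starting in the derived image and walking down
 * the chain of base images until one of them holds the data.
 */
const char *
image_lookup(const struct image *img, size_t idx)
{
	if (idx >= NCLUSTERS)
		return (NULL);
	for (; img != NULL; img = img->base)
		if (img->cluster[idx] != NULL)
			return (img->cluster[idx]);
	return (NULL);	/* not allocated anywhere in the chain */
}
```

Because the derived image only stores what differs from the base, any
later change to the base silently changes what the derived image reads,
which is the corruption risk the commit message warns about.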


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
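
The class of bug fixed in 1.100 can be shown with a small sketch: fd
fields left at zero look like valid descriptor 0 (stdin), so they are
set to -1 up front and -1 is skipped on close. The struct vm_fds type,
NFDS and both function names are made up for this example.

```c
#include <unistd.h>

#define NFDS	4	/* hypothetical number of fds tracked per VM */

struct vm_fds {
	int	fd[NFDS];
};

/* Mark every slot as "no descriptor" instead of leaving it 0. */
void
vm_fds_init(struct vm_fds *v)
{
	int i;

	for (i = 0; i < NFDS; i++)
		v->fd[i] = -1;
}

/* Close only the slots in use; returns how many were closed. */
int
vm_fds_close(struct vm_fds *v)
{
	int i, n = 0;

	for (i = 0; i < NFDS; i++) {
		if (v->fd[i] == -1)
			continue;	/* never call close(-1) */
		close(v->fd[i]);
		v->fd[i] = -1;
		n++;
	}
	return (n);
}
```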


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@
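
The "tracked in total usage" rule from 1.98 amounts to checking each
request against running per-user totals. A minimal sketch, using the
limits quoted in the commit message; struct usage, usage_add() and the
MAX_* names are assumptions made for this example only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Limits quoted in the commit message (hard-coded at the time). */
#define MAX_CPUS	4
#define MAX_MEM_MB	2048
#define MAX_IFS		8

/* Hypothetical per-user running totals. */
struct usage {
	unsigned int	cpus;
	size_t		mem_mb;
	unsigned int	ifs;
};

/*
 * Account for one more VM.  Returns false, leaving the totals
 * untouched, if any limit would be exceeded.
 */
bool
usage_add(struct usage *u, unsigned int cpus, size_t mem_mb,
    unsigned int ifs)
{
	if (u->cpus + cpus > MAX_CPUS ||
	    u->mem_mb + mem_mb > MAX_MEM_MB ||
	    u->ifs + ifs > MAX_IFS)
		return (false);
	u->cpus += cpus;
	u->mem_mb += mem_mb;
	u->ifs += ifs;
	return (true);
}
```

With these totals, four single-CPU VMs of 512M each fit, as does one VM
with 2G, matching the example in the commit message.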


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@
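
The open-then-check order from 1.97 can be sketched as below. The point
is that fstat(2) examines the file that was actually opened, so swapping
the path between check and open no longer helps an attacker. The
function name open_checked() and the policy it applies (regular file
owned by the given uid) are assumptions; vmd's real permission check
may differ.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open `path' read-only, then verify the opened file.
 * Returns the fd on success, -1 on failure.
 */
int
open_checked(const char *path, uid_t uid)
{
	struct stat st;
	int fd;

	if ((fd = open(path, O_RDONLY)) == -1)
		return (-1);
	/* Check the descriptor, not the path, to avoid the race. */
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode) ||
	    st.st_uid != uid) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```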


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non-existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters and '.', '_' and '-', but does not start with any of the latter
three.

ok mlarkin@ pd@
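
The naming rule in 1.66 is simple enough to express directly. This is a
sketch under stated assumptions: vm_name_ok() is a made-up name, and
rejecting NULL or empty names is an extra assumption not stated in the
commit message.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * A name may contain alphanumerics plus '.', '_' and '-', but must
 * not start with one of those three punctuation characters.
 */
bool
vm_name_ok(const char *name)
{
	size_t i;

	if (name == NULL || name[0] == '\0')
		return (false);	/* assumption: empty names are invalid */
	if (name[0] == '.' || name[0] == '_' || name[0] == '-')
		return (false);
	for (i = 0; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return (false);
	}
	return (true);
}
```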


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.108 09-Dec-2018 claudio

When -B is used to specify a specific boot device also change the reboot
behaviour of vmd to stop / exit at guest reboot.
OK ccardenas@


# 1.107 04-Dec-2018 claudio

Introduce IMSG_VMDOP_WAIT_VM_REQUEST a control message that registers a
vmctl peerid that should be informed when the VM is stopped (like when the
guest does a shutdown). Uses the same logic as using the VMOP_WAIT flag on
IMSG_VMDOP_TERMINATE_VM_REQUEST.
Ok ccardenas@, reyk@


# 1.106 26-Nov-2018 ori

Keep a list of known vms, and reuse the VM IDs.

This means that when using '-L', the IP addresses of the VMs are stable.

ok reyk@


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of suchs
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This works is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds ot the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating disk derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) clould close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@
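
The general pattern is to open the file first and then inspect the opened
descriptor with fstat(2), so the check applies to the same file that will be
used. The sketch below is a simplified illustration: open_checked() is a
hypothetical helper, and the ownership test is an assumed policy, not the
permission logic vmd actually applies.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open path read-only, then verify the opened file itself.
 * Returns the fd on success, -1 if the open or the check fails.
 * Assumed policy: the file must be a regular file owned by uid.
 */
int
open_checked(const char *path, uid_t uid)
{
	struct stat	st;
	int		fd;

	if ((fd = open(path, O_RDONLY | O_NONBLOCK)) == -1)
		return (-1);
	/* fstat(2) looks at the fd, not the path, so a rename cannot race it. */
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode) ||
	    st.st_uid != uid) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```

Checking the path with stat(2) before open(2) would leave a window in which
the path could be swapped for another file; checking the descriptor closes it.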


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@
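
The rule stated above can be written as a short check. This is a sketch based
only on the wording of the commit message; name_is_valid() is a hypothetical
name, and rejecting the empty string is an added assumption.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Return 1 if name consists of alphanumerics, '.', '_' and '-' and
 * does not start with '.', '_' or '-'; return 0 otherwise.
 * Assumption: NULL and empty names are rejected as well.
 */
int
name_is_valid(const char *name)
{
	size_t		i;
	unsigned char	c;

	if (name == NULL || name[0] == '\0')
		return (0);
	if (name[0] == '.' || name[0] == '_' || name[0] == '-')
		return (0);
	for (i = 0; name[i] != '\0'; i++) {
		c = (unsigned char)name[i];
		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return (0);
	}
	return (1);
}
```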


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings, which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't do much difference yet but a future change will compare
if a specified VM is already running. "load" will allow to add
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) but runs as root but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows to further reduce the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows to remove the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.105 21-Nov-2018 reyk

Add support for "local inet6" interfaces.

ok & test ccardenas@, additional review from kn@


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
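
The counting scheme described above might be sketched like this. The two
constants and their values come from the commit message; struct restart_state
and restart_allowed() are hypothetical, and resetting the counter after a
slow restart is an assumption the message does not spell out.

```c
#include <time.h>

#define VM_START_RATE_SEC	6	/* a restart sooner than this is "fast" */
#define VM_START_RATE_LIMIT	3	/* fast restarts before the VM is stopped */

/* Hypothetical per-VM bookkeeping; not vmd's actual structure. */
struct restart_state {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_count;	/* consecutive fast restarts seen so far */
};

/*
 * Record a (re)start at time now.
 * Returns 0 if the VM may start, -1 if the limit has been reached.
 */
int
restart_allowed(struct restart_state *rs, time_t now)
{
	if (rs->last_start != 0 &&
	    now - rs->last_start < VM_START_RATE_SEC)
		rs->fast_count++;
	else
		rs->fast_count = 0;	/* assumed: a slow restart resets */
	rs->last_start = now;

	return (rs->fast_count >= VM_START_RATE_LIMIT ? -1 : 0);
}
```

A VM that restarts one second apart is refused on the third fast restart,
while a VM that reboots once a minute never accumulates a count.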


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents TOCTOU (time-of-check to time-of-use) attacks for instances.

OK mlarkin@


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow to use configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows to open vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a non existent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters, including '.', '_' and '-', but does not start with the latter
three.

ok mlarkin@ pd@


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtime of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch; the
implementation is independent of the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and individual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow starting disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow performing a cold reboot of a VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This makes the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warnings which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled-by-default VMs in vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow adding an interface to an interface group with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". Network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows removing the vm from the
daemon's list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet but a future change will compare
if a specified VM is already running. "load" will allow adding
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root, but pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows further reducing the privileges, to give better
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it after
the last VM is terminated. This allows removing the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls. You'll
have to update kernel and userland for this change, as the kernel ABI
changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and allows to escape the loop. This code is kind of
temporary, as we're planning to replace the event handling, but it
allows to kill (or Ctrl+c) vmd for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.


# 1.104 15-Oct-2018 reyk

Prevent VM reboot loops by rate-limiting the interval a VM can reboot.

This looping has been experienced by people who run VMs with a broken
kernel or boot loader that trigger a very fast reboot loop (triple
fault) of a VM that ends up using a lot of CPU and resources on the
host. Some fixes in vmm(4) and vmd(8) helped to avoid such conditions
but it can still occur if something is wrong in the guest VM itself.

If the VM restarts after less than VM_START_RATE_SEC (6) seconds, we
increment the limit counter. After VM_START_RATE_LIMIT (3) of such
fast reboots the VM is stopped.

There are only very few people who intentionally want to reboot-loop a
VM very quickly (many times within a second); mostly for fuzzing.
They will have to recompile and adjust the stated #defines in the code
as we don't have a config option to disable it.

OK mlarkin@
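
The rate limit described in 1.104 can be sketched roughly as follows. This is an illustrative reading of the commit message, not the code from vmd.c: the struct, the function name, and the decision to reset the counter after a slow restart are assumptions. Only the two #define names and their values (6 and 3) are taken from the log.

```c
#include <stdbool.h>
#include <time.h>

#define VM_START_RATE_SEC	6	/* restarts faster than this are "fast" */
#define VM_START_RATE_LIMIT	3	/* fast restarts tolerated before stopping */

/* Hypothetical per-VM bookkeeping; not vmd's actual struct layout. */
struct start_rate {
	time_t	last_start;	/* time of the previous start, 0 if none */
	int	fast_count;	/* fast restarts counted so far */
};

/*
 * Record a (re)start at time `now`.  Returns true if the VM may start,
 * false once VM_START_RATE_LIMIT fast restarts have accumulated.
 * Assumption: a restart that is not fast resets the counter.
 */
bool
start_rate_allow(struct start_rate *sr, time_t now)
{
	if (sr->last_start != 0 && now - sr->last_start < VM_START_RATE_SEC)
		sr->fast_count++;
	else
		sr->fast_count = 0;
	sr->last_start = now;
	return (sr->fast_count < VM_START_RATE_LIMIT);
}
```

With these values, a VM that restarts once per second is allowed its initial start plus two fast restarts, and is refused on the third.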


Revision tags: OPENBSD_6_4_BASE
# 1.103 08-Oct-2018 reyk

Add support for qcow2 base images (external snapshots).

This work is from Ori Bernstein, committing on his behalf:

Add support to vmd for external snapshots. That is, snapshots that are
derived from a base image. Data lookups start in the derived image,
and if the derived image does not contain some data, the search
proceeds to the base image. Multiple derived images may exist off of
a single base image.

A limitation of this format is that modifying the base image will
corrupt the derived image.

This change also adds support for creating derived disk images to
vmctl. To use it:

vmctl create derived.qcow2 -s 16G -b base.qcow2

From Ori Bernstein
OK mlarkin@ reyk@
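
The lookup order described in 1.103 (derived image first, then the base image) can be modelled with a toy in-memory chain. This sketch is not the qcow2 on-disk format and not vmd's implementation; the struct, the fixed cluster count, and the one-byte "clusters" are all assumptions chosen to keep the example short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NCLUSTERS 4	/* toy image size, in clusters */

/* Hypothetical image: one byte per cluster plus an allocation map. */
struct image {
	const struct image *base;	/* NULL for a base image */
	bool	present[NCLUSTERS];	/* is the cluster allocated here? */
	uint8_t	data[NCLUSTERS];
};

/*
 * Read one cluster: start in the derived image and fall back along the
 * chain of base images.  Clusters allocated nowhere read as zero.
 */
uint8_t
image_read(const struct image *img, size_t cluster)
{
	if (cluster >= NCLUSTERS)
		return (0);
	for (; img != NULL; img = img->base)
		if (img->present[cluster])
			return (img->data[cluster]);
	return (0);
}
```

The model also shows why modifying the base corrupts derived images: any cluster the derived image has not overridden is read straight from the base.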


# 1.102 29-Sep-2018 pd

vmd: don't remove vm if sending failed

Fix a bug where a vm was removed in vmd.c after vmctl send even if sending
failed.
spotted by solene@
ok mlarkin@


# 1.101 28-Sep-2018 reyk

Fix copy-pasto to use maxmem instead of maxcpu

Reported by Greg Steuck

OK mlarkin@


# 1.100 10-Sep-2018 bluhm

vmd(8) could close file descriptor 0 as not all fd fields were
properly initialized with -1. Also avoid closing -1.
OK mlarkin@
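
The class of bug fixed in 1.100 can be illustrated with a small sketch. The struct and function names here are hypothetical and do not reflect vmd's data structures; the point is only that zero-initialized fd fields look like descriptor 0, so they should be set to -1 and -1 should never be passed to close(2).

```c
#include <unistd.h>

#define NFDS 4	/* hypothetical number of fd slots per VM */

struct vm_fds {
	int fd[NFDS];
};

/* Mark every slot as "not open" so a later cleanup cannot hit fd 0. */
void
fds_init(struct vm_fds *v)
{
	int i;

	for (i = 0; i < NFDS; i++)
		v->fd[i] = -1;
}

/* Close only the slots that hold a real descriptor, then reset them. */
void
fds_close(struct vm_fds *v)
{
	int i;

	for (i = 0; i < NFDS; i++) {
		if (v->fd[i] == -1)
			continue;
		close(v->fd[i]);
		v->fd[i] = -1;
	}
}
```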


# 1.99 10-Sep-2018 bluhm

During the fork+exec implementation, daemon(3) was moved after
proc_init(). As a consequence vmd(8) child processes did not detach
from the terminal anymore. Dup /dev/null to the stdio file descriptors
in the children.
OK mlarkin@ reyk@


# 1.98 15-Jul-2018 reyk

Track resources and enforce cpu/memory/interface limits for non-root users.

The limits are currently hard-coded and undocumented (4 CPUs/VMs, 2G
memory, 8 interfaces) but will be configurable in an upcoming diff.
These limits are tracked in total usage; for example, a user will be
able to run up to 4 VMs with 512M of memory or a single VM with 2G.

OK ccardenas@ mlarkin@
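
A rough sketch of the total-usage accounting described in 1.98 follows. The limit values (4 CPUs/VMs, 2G memory, 8 interfaces) come from the commit message; the macro names, the struct, and the function are hypothetical and are not taken from vmd.

```c
#include <stdbool.h>
#include <stdint.h>

/* Limits from the commit message; the macro names are made up. */
#define USER_MAXCPU	4
#define USER_MAXMEM_MB	2048
#define USER_MAXIFS	8

/* Hypothetical running totals for one non-root user. */
struct user_usage {
	uint32_t cpus;
	uint32_t mem_mb;
	uint32_t ifs;
};

/*
 * Try to account for one more VM.  Returns false and leaves the totals
 * untouched if any limit would be exceeded.
 */
bool
usage_reserve(struct user_usage *u, uint32_t ncpu, uint32_t mem_mb,
    uint32_t nifs)
{
	/* Totals never exceed the limits, so the subtractions cannot wrap. */
	if (ncpu > USER_MAXCPU - u->cpus ||
	    mem_mb > USER_MAXMEM_MB - u->mem_mb ||
	    nifs > USER_MAXIFS - u->ifs)
		return (false);
	u->cpus += ncpu;
	u->mem_mb += mem_mb;
	u->ifs += nifs;
	return (true);
}
```

This reproduces the example in the commit message: four single-CPU VMs with 512M each fit, as does one VM with 2G, but a fifth VM is refused.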


# 1.97 13-Jul-2018 reyk

Check the disk/kernel/cdrom file permissions after opening the fd.

This prevents time-of-check to time-of-use (TOCTOU) attacks, for instance.

OK mlarkin@
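
The idea behind 1.97 is to inspect the file through the descriptor that was actually opened, using fstat(2), rather than checking the path first and opening it afterwards. The sketch below is a simplified assumption of what such a check can look like; the function name and the specific policy (regular file owned by a given uid) are illustrative and are not vmd's real permission logic.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open `path` read-only and verify, on the resulting descriptor, that it
 * is a regular file owned by `uid`.  Returns the fd, or -1 on failure.
 * Because the check uses the fd, the path cannot be swapped in between.
 */
int
open_checked(const char *path, uid_t uid)
{
	struct stat st;
	int fd;

	if ((fd = open(path, O_RDONLY | O_NONBLOCK)) == -1)
		return (-1);
	if (fstat(fd, &st) == -1 || !S_ISREG(st.st_mode) ||
	    st.st_uid != uid) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```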


# 1.96 13-Jul-2018 reyk

Add "allow instance" option.

This allows users to create VM instances and change desired options,
for example a user can be allowed to run a VM with all the
pre-configured options but specify an own disk image.

(mlarkin@ was fine with iterating over it)

OK ccardenas@


# 1.95 12-Jul-2018 reyk

Allow using configured/running VMs as templates for other VM instances.

This introduces new grammar and the -t option in vmctl start.

(For now, only root can create VM instances; but it is planned to allow
users to create their own VMs based on permissions and quota.)

OK ccardenas@ mlarkin@ jmc@


# 1.94 11-Jul-2018 reyk

style - indent each case statement in a switch.


# 1.93 11-Jul-2018 reyk

Add -w option to vmctl stop to wait for completion of VM termination.

Use it in /etc/rc.d/vmd accordingly.

OK sthen@


# 1.92 11-Jul-2018 reyk

Rename function to vmd_check_vmh


# 1.91 11-Jul-2018 reyk

Add -f option to vmctl stop to forcefully kill a VM.

This also fixes a bug in vmm_sighdlr where it might have missed
forwarding the TERMINATE_EVENT to the vmd parent after a VM child
died, leading to an abandoned VM in the vmd parent process.

OK ccardenas@ mlarkin@ benno@ kn@


# 1.90 10-Jul-2018 reyk

style (single-line ifs don't need braces)


# 1.89 10-Jul-2018 reyk

vmd already had DEBUG/DPRINTF, there is no need for VMD_DEBUG/dprintf

Replace all occurrences of dprintf with DPRINTF (defined in proc.h).


# 1.88 10-Jul-2018 reyk

Tweak debug log messages

- Turn tracing messages into DPRINTF (only compiled with DEBUG).

- Pass __func__ to vm_stop and vm_remove: this way we can track who
called the function in the async context. It replaces the manual
log_debug in front of each vm_stop/vm_remove. This debug logging
trick can be removed in the future once we are more confident about
it.

OK ccardenas@ mlarkin@


# 1.87 26-Jun-2018 reyk

Add "socket owner" to allow changing the owner of the vmd control socket.

This allows opening vmctl control or console access to other users
that are not in group wheel. Access for non-root users still defaults
to read-only actions unless you change the owner (user/group) of each
individual VM.

Requested by Mischa Peters

OK mlarkin@


# 1.86 19-Jun-2018 reyk

knf


# 1.85 13-May-2018 pd

vmd(8): enable pause / unpause for vm owners

Patch from Mohamed Aslan. Thanks!
ok kn@


# 1.84 25-Apr-2018 mlarkin

vmd(8)'s early error messages weren't visible when started via /etc/rc
(such as errors relating to not having VMX/etc). Change the log_init
to log to syslog so at least we have some chance of seeing these errors.

requested and ok beck@


# 1.83 21-Apr-2018 mlarkin

spelling error in log message


# 1.82 29-Mar-2018 martijn

Make sure that the global config is sent out immediately when it is
loaded. This makes sure that the local prefix specified in the config is
always used.

OK ccardenas@


Revision tags: OPENBSD_6_3_BASE
# 1.81 14-Mar-2018 mlarkin

block two VMs from using the same disk image file at the same time.
Also changes an error message in vmctl to reflect same.


# 1.80 18-Feb-2018 pd

vmd: fix vmctl pause for non existing vm ids (never returns)

check if vm id is valid before sending to vmm for pausing. The 'lock' is caused
by vmm sending back ENOENT for a nonexistent vm but vmd drops the message
because it doesn't recognize the vmid vmm is talking about. This is an artifact
of the 'policy' don't trust any imsg from a sibling priv sep process and do
your own checking.

reported by Abel Abraham Camarillo Ojeda
ok mlarkin@ and ccardenas@


# 1.79 10-Jan-2018 sthen

Don't require "disk" or "kernel", also allow just "cdrom" instead, a VM can
still be useful with only cdrom storage. ok ccardenas@


# 1.78 08-Jan-2018 mpi

Enable TIOCUCNTL to be able to set ns8250's break detected condition.

It is now possible to send BREAK commands to vmd(8) independently of
the serial terminal emulator.

Happy virtual ddb(4) hacking!

No objection from mlarkin@, ok nicm@, ccardenas@, deraadt@


# 1.77 03-Jan-2018 ccardenas

Add initial CD-ROM support to VMD via vioscsi.

* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary),
CentOS 6 (secondary), Ubuntu 17.10 (secondary).
NOTE: Secondary indicates some issue(s) preventing full/reliable
functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's
default BIOS) will boot from CD-ROM.

ok mlarkin@, jca@


# 1.76 06-Dec-2017 abieber

Make vmd respect owner when starting non-disabled vms.

OK pd@, benno@


# 1.75 30-Nov-2017 ccardenas

When performing vmctl reload and a previously configured vm is running,
exit with an EALREADY vs EPERM.

ok mlarkin@


# 1.74 11-Nov-2017 mlarkin

update switch handling in vmd(8). vmd now gets switch information (rdomain,
etc) from underlying switch interface instead of handling this on its
own.

Diff from carlos cardenas, Thanks!

ok reyk@


# 1.73 07-Nov-2017 mlarkin

typo in previous


# 1.72 07-Nov-2017 mlarkin

comment function vm_checkperm


# 1.71 24-Oct-2017 mlarkin

The VMD parent process didn't handle the case of a VM exiting
with a non 0 return properly (i.e. EIO).

From: Carlos Cardenas, thanks!


# 1.70 07-Oct-2017 mlarkin

vmd: retain ownership on vm reboot

from Jesper Wallin, thanks!


Revision tags: OPENBSD_6_2_BASE
# 1.69 08-Sep-2017 mlarkin

vmd: add more explanatory log_debug messages

From Carlos Cardenas, many thanks!


# 1.68 20-Aug-2017 pd

vmd: Allow only upward migration

This restricts receiving vms from hosts with more cpu features.

Tested on
broadwell -> skylake (works)
skylake -> broadwell (doesn't work)

ok mlarkin@


# 1.67 15-Aug-2017 pd

vmd: fix vm id displayed by vmctl when receiving a vm

Also fix two debug messages and an IMSG type.


# 1.66 14-Aug-2017 jasper

validate vm names before creating them; a valid name contains alphanumeric
characters plus '.', '_' and '-', but does not start with any of the latter
three.

ok mlarkin@ pd@
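
The naming rule from 1.66 can be written down as a short check. This is a sketch of the rule as stated in the commit message, not the function from vmd; the name vm_name_ok and the rejection of empty names are assumptions.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * A name may contain alphanumerics plus '.', '_' and '-', and must not
 * start with one of those three.  Assumption: empty names are invalid.
 */
bool
vm_name_ok(const char *name)
{
	size_t i;

	if (name == NULL || name[0] == '\0')
		return (false);
	if (!isalnum((unsigned char)name[0]))
		return (false);
	for (i = 1; name[i] != '\0'; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '.' && c != '_' && c != '-')
			return (false);
	}
	return (true);
}
```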


# 1.65 13-Aug-2017 jasper

don't issue a termination command to an already stopped vm

ok mlarkin@


# 1.64 15-Jul-2017 pd

Add vmctl send and vmctl receive

ok reyk@ and mlarkin@


# 1.63 09-Jul-2017 pd

vmd/vmctl: Add ability to pause / unpause vms

With help from Ashwin Agrawal

ok reyk@ mlarkin@


# 1.62 29-May-2017 mlarkin

vmd(8): prevent crashing when presented with a vm name argument to
"vmctl stop" that doesn't exist.

Diff from Pratik Vyas, thanks!


# 1.61 04-May-2017 reyk

Report command failure back to vmctl reload, reset, load, log verbose.

OK mlarkin@


# 1.60 04-May-2017 reyk

Add support for rdomains.

This allows to configure VM interfaces and switches in individual rdomains.

OK mlarkin@


# 1.59 25-Apr-2017 reyk

Generate randomized MAC addresses earlier to keep them across reboots.

OK deraadt@


# 1.58 21-Apr-2017 reyk

Add global configuration option "local prefix" to change prefix for -L.

The default prefix is 100.64.0.0/10 from RFC6598.

Requested by sthen@ chris@
OK mlarkin@


# 1.57 19-Apr-2017 reyk

Add support for dynamic "NAT" interfaces (-L/local interface).

When a local interface is configured, vmd configures a /31 address on
the tap(4) interface of the host and provides another IP in the same
subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server
that replies with IP, gateway, and DNS addresses to the VM. The
built-in server only ever responds to the VM on the inside and cannot
leak its DHCP responses to the outside.

Thanks to Uwe Werler, Josh Grosse, and some others for testing!

OK deraadt@


# 1.56 06-Apr-2017 reyk

Do not expose vmm(4) VM IDs to the user, use vmd(8)'s IDs instead.

Each VM has two IDs: one from the kernel (vmm) and a different one
from userland (vmd). The vmm ID is not consistent and incremented on
every boot during runtimg of the host system. The vmd ID remains the
same during the lifetime of a configured VM, even after reboots.
Configured VMs will even get and keep their IDs when the configuration
is loaded. This is more what users expect.

Pointed out and tested by otto@

OK deraadt@


Revision tags: OPENBSD_6_1_BASE
# 1.55 15-Mar-2017 reyk

More fixes for starting and stopping VMs, fixing fallout from vm_running.

- Don't start a VM that is already running
- Keep the VM as running until it is powered off (and not stopping)
- Don't fatal in the parent if the vmm process referenced an unknown VM
- Don't stop a VM that is already stopping
- Indicate that a VM is stopping in "vmctl status"

The previous "vmctl stop; vmctl stop" to force-shutdown is not
supported anymore - the shutdown timeout should make sure that the VM
is really terminated. To force-shutdown, reference the VM by ID.
We might add a flag to vmctl stop to just turn the VM off.


# 1.54 15-Mar-2017 reyk

Close the tty if the VM was powered down.

The parent keeps a copy of each VM's tty fd to reuse it on reboot.
Close this tty if the VM was stopped, and not rebooted, by calling
vm_stop(vm, 0) instead of just setting vm_running to 0. Also make
sure that vm_ttyname is not used after free'ing it.


# 1.53 02-Mar-2017 reyk

Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.

This is especially useful when multiple VMs share a switch, the
implementation is independent from the underlying switch or bridge.

no objections mlarkin@


# 1.52 01-Mar-2017 reyk

Add "owner" option to set a user/group ownership for pre-configured VMs

This allows matching users to start or stop VMs that they "own" and to
access the console accordingly.

OK mlarkin@


# 1.51 27-Feb-2017 reyk

Replace openpty(3) with local function that uses pre-opened /dev/ptm fd

This allows more flexibility for upcoming changes and better pledge.
We also didn't use half of the features of libutil's openpty function.
Additionally, make sure that the ttys are closed correctly on shutdown.

OK gilles@


# 1.50 13-Jan-2017 edd

Make it possible to remove VMs from vmd(8)'s internal queue.

The semantics agreed with reyk@ are:

* ad-hoc created vms, created with `vmctl start`, are removed once stopped.
* Stopped VMs defined in a config file are flushed before a `vmctl reload`.

OK reyk@


# 1.49 11-Jan-2017 reyk

Add imsg communication channel between vmd and invividual VMs.
For now, this is only used to forward "log verbose|brief" requests,
but it will be used for better things later.

OK mlarkin@


# 1.48 09-Jan-2017 reyk

Stop accessing verbose and debug variables from log.c directly.

This replaces log_verbose() and "extern int verbose" with the two functions
log_setverbose() and log_getverbose().

Pointed out by benno@
OK krw@ eric@ gilles@ (OK gilles@ for the snmpd bits as well)


# 1.47 14-Dec-2016 reyk

Allow to start disabled and pre-configured VMs by name, "vmctl start foo".

With testing from Jon Bernard

OK mlarkin@


# 1.46 14-Dec-2016 reyk

If a VM terminates with the result EAGAIN, close all fds except the
pty and re-send it to the vmm monitor process. With additional
changes in vmm.c, this will allow perform a cold reboot of VM.

With testing and feedback from Jon Bernard
OK mlarkin@


# 1.45 26-Nov-2016 reyk

Implement basic support for boot.conf(8) on the disk image.

Like the real boot loader, load and parse hd0a:/etc/boot.conf from the
first disk and fall back to /bsd. Not all boot loader options are
supported, but it at least does set device, set image, and boot -acds
(eg. for booting single-user).

For example, it can now boot install60.fs that includes a boot.conf
with "set image /6.0/amd64/bsd.rd":
vmctl start install -c -d install60.fs -d OpenBSD.img

This pseudo-bootloader is only needed without BIOS and could
potentially be replaced in the future.

OK mlarkin@


# 1.44 26-Nov-2016 reyk

If -m/memory is not specified, use 512M by default.

Default value picked with mlarkin - not too small and not too large.

OK mlarkin@


# 1.43 24-Nov-2016 reyk

Add support for booting the kernel from the disk image.

This make the kernel/-k argument optional and, if not specified, tries
to find the /bsd kernel in the primary hd0a partition of the first
disk image itself. It doesn't support hd0a:/etc/boot.conf yet, and it
is no BIOS or full boot loader, but it makes booting and handling of
VMs a bit easier - booting an external kernel is still supported.

The UFS file system code ufs.c is directly from libsa which is also
used by the real boot loader. The code compiles with a few signedness
warning which will be fixed separately.

OK mlarkin@


# 1.42 22-Nov-2016 reyk

Fix error path of config_setvm() and its callers. This unbreaks
loading of invalid kernel files.

Reported by mlarkin@
OK mlarkin@


# 1.41 22-Nov-2016 reyk

There is no need for res when there is already ret.


# 1.40 22-Nov-2016 edd

Insert disabled VMs into vmd(8)'s queues and allow vmctl(8) to display them.

Tested by Jon Bernard and reyk@.

OK reyk@, no objections mlarkin@.

Thanks


# 1.39 04-Nov-2016 reyk

Pass the internal vmid or 0 to vm_register() instead of changing it
once again after setting the next available id.

Suggested by edd@


# 1.38 04-Nov-2016 reyk

Update the config/register/get VM methods to match the config_set/get
style that is used in other places. Also keep the vmid from the parent.

OK edd@


# 1.37 29-Oct-2016 edd

Separate parsing vms and switches from starting them in vmd(8).

Brings us one step closer to having disabled by default vms is vm.conf(5),
which can be started with vmctl(8).

Input, testing and OK reyk@. Thanks.


# 1.36 17-Oct-2016 reyk

Add the option to specify an interface group per virtual switch as well;
this group will be added to all VM tap(4) interfaces in the switch.

Tested by martijn@


# 1.35 15-Oct-2016 reyk

Allow to add an interface to an interface group; with the group keyword.

Requested and tested by martijn@


# 1.34 12-Oct-2016 reyk

Fix functionality and semantics of vmctl load/reload/reset.

OK rzalamena@


# 1.33 06-Oct-2016 reyk

Terminate VMs on shutdown of vmd instead of leaving them running as
undead VM processes.

OK mlarkin@


# 1.32 05-Oct-2016 reyk

Add support for enhanced networking configuration and virtual switches.
See vm.conf(5) for more details.

OK mlarkin@


# 1.31 04-Oct-2016 reyk

Add a new "priv" process that is responsible for ioctls and restricted
operations that aren't allowed under pledge. This is a companion to
the "vmd" process that runs as root but with pledge.

With the "priv" process, each new tap(4) interface now gets a
description to indicate the vm, eg. "vm1-if0-myvm". For network
configuration will be done by vmd/priv later.

OK mlarkin@


# 1.30 29-Sep-2016 reyk

Implement fork+exec for vmd, using the same framework from httpd etc.

No objections from mlarkin@ sunil@


# 1.29 17-Aug-2016 deraadt

small bits of header cleanup; ok mlarkin


# 1.28 29-Jul-2016 stefan

Allow starting a VM again after it was terminated

If a VM exits, terminate it and remove it from the list of
available VMs. That allows a VM with name `foo' to be restarted
after it has exited.

This changes structures shared between vmd and vmctl. You need to
rebuild vmctl also.

ok mlarkin@


Revision tags: OPENBSD_5_9_BASE OPENBSD_6_0_BASE
# 1.27 05-Feb-2016 reyk

Fix a possible use-after-free in vmd, forward the result to the
control socket before free'ing the vm.

Found by and OK jsg@


# 1.26 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.25 11-Dec-2015 reyk

The vmctl "id" argument can now be a number of or a vm name, eg.
vmctl stop 3
vmctl stop "openbsd.vm"


# 1.24 08-Dec-2015 jsg

when checking the config file with -n don't open /dev/vmm or require root
ok reyk@


# 1.23 08-Dec-2015 jsg

make the -f option work as intended
ok reyk@


# 1.22 07-Dec-2015 reyk

tweak initial error logging


# 1.21 06-Dec-2015 reyk

Prevent running a VM with the same name multiple times - multiple
instances of the same configuration will be handled in a different way
later. It is also not a good idea to use the same writeable disk
with multiple VMs at the same time.

As discussed with mlarkin@


# 1.20 06-Dec-2015 reyk

Print the TTY in the vmctl status output.


# 1.19 06-Dec-2015 reyk

When a new vm is created with VMM_IOC_CREATE, the kernel assigns a
unique id to it. This happens in the vm child process and has to be
communicated to the parent processes to track the vm. Knowing the vm
id in the parent and vmm processes also allows to remove vm from the
daemons list on terminate requests later.


# 1.18 06-Dec-2015 reyk

Check errno from config_getvm() correctly


# 1.17 05-Dec-2015 reyk

Print shorter error message if opening /dev/vmm failed.

Pointed out by deraadt@


# 1.16 03-Dec-2015 reyk

Re-add the "load" and "reload" commands to vmctl: Instead of parsing
the configuration in vmctl directly, it now sends a (re)load request
to vmd. The reload also resets the existing configuration status -
this doesn't make much difference yet, but a future change will check
whether a specified VM is already running. "load" adds to the
configuration, while "reload" resets the state before loading.


# 1.15 03-Dec-2015 reyk

Add and document -D and -f flags to vmd.


# 1.14 03-Dec-2015 reyk

mlarkin's code has been moved to vmm.c, so it is ok to claim the copyright.


# 1.13 03-Dec-2015 reyk

Add support for an optional vm.conf(5) file in vmd. This will replace
vmm.conf(5) in vmmctl. For a short time, both vmd and vmmctl will
support a configuration file, but vmmctl will be changed to send
"load" requests to vmd instead of loading and parsing the file
directly.


# 1.12 03-Dec-2015 reyk

prepare config_getvm() for parse.y


# 1.11 02-Dec-2015 reyk

send the tty name to vmmctl and print it as a result.


# 1.10 02-Dec-2015 reyk

Split the fully privileged parent into two processes "parent" and
"vmm" with reduced privileges:
- the "parent" opens fds (disks, ifs, etc.) and runs as root, but is pledged as
"stdio rpath wpath proc tty sendfd".
- the "vmm" process handles the creation and supervision of vm processes,
and the primary communication with the vmm(4) subsystem. It runs as _vmd
in the chroot but does not use pledge, as the vmm ioctls are not allowed
by any pledge model yet.
With this change, vmd starts to track the configuration state of VMs
in vmd and will allow other things later (like terminating a vm by
name, moving the configuration parser to vmd, ...). More incremental
changes will follow.


# 1.9 02-Dec-2015 reyk

Start tweaking vmd's privsep and daemon model by splitting the main
process into multiple parts and adopting the "proc.c"-style from other
daemons. This allows us to further reduce the privileges, to tighten
pledge(2), and to add some upcoming changes.

"please do" mlarkin@, deraadt@


# 1.8 26-Nov-2015 reyk

Automatically start vmm(4) when the first VM is created and stop it
after the last VM is terminated. This allows the explicit "vmm
enable" / "vmm disable" (VMM_IOC_START / VMM_IOC_STOP) ioctls to be
removed. You'll have to update kernel and userland for this change,
as the kernel ABI changes.

OK mpi@ mlarkin@


# 1.7 25-Nov-2015 tedu

typo: should be looking for pid == -1


# 1.6 23-Nov-2015 reyk

accept4() is restarted after signals, which prevents vmd from exiting
in the current control socket loop. Add a poll before the accept that
is not restarted and makes it possible to escape the loop. This code
is kind of temporary, as we're planning to replace the event handling,
but it allows vmd to be killed (or stopped with Ctrl+C) for now.

OK tedu@, discussed with many


# 1.5 23-Nov-2015 reyk

I accidentally removed a newline in usage() when converting the log
messages to log_*.

From Cesare Gargano


# 1.4 23-Nov-2015 reyk

Add support for logging to stderr or syslog, and to run vmd in
foreground with -d.

OK mlarkin@ jung@


# 1.3 22-Nov-2015 deraadt

use PATH_MAX where needed


# 1.2 22-Nov-2015 reyk

Add $ Ids


# 1.1 22-Nov-2015 mlarkin

vmd(8) - virtual machine daemon.

There is still a lot to be done, and fixed, in these userland components
but I have received enough "it works, commit it" emails that it's time
to finish those things in tree.

discussed with many, tested by many.