#
1.100 |
|
29-Apr-2024 |
dv |
vmm & vmd: drop "continue" flag to simplify running a vcpu.
There's no need to distinguish the "first" time running a vcpu from the subsequent times because vmm(4) uses in-kernel state tracking the last vm exit reason to optimize the logic for updating vcpu registers from userland. While here, clean up the DPRINTF's to make the Intel VMX logic similar to the AMD SVM.
ok mlarkin@
|
#
1.99 |
|
09-Apr-2024 |
dv |
vmm/vmd: add exception injection and refactor inject api.
In order to continue work on mmio and other instruction emulation, vmd(8) needs the ability to inject exceptions (like page faults) from userland.
Refactor the way events are injected from userland, cleaning up how hardware (external) interrupts are injected in the process.
ok mlarkin@
|
Revision tags: OPENBSD_7_5_BASE
|
#
1.98 |
|
20-Feb-2024 |
dv |
Utilize separate threads for RX and TX in vmd(8)'s vionet.
This commit adds multithreading to allow both virtqueues to be processed in parallel along with additional synchronization primitives to protect device configuration state. Allowing RX and TX to operate independently reduces overall network latency for guests and helps alleviate the TX side dominating cpu time.
Tested with help from phessler@, kn@, and mlarkin@. ok mlarkin@.
|
#
1.97 |
|
05-Feb-2024 |
dv |
Cleanup fcntl(3) usage and fd lifetimes in vmd(8).
Remove extraneous fcntl(3) usage for setting fd features that can be set at time of open(2), pipe2(2), or socketpair(2). Also cleans up pty creation switching to using functions from libutil instead of direct ioctl(2) calls.
ok mlarkin@, original diff ok claudio@ as well.
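As a rough illustration of setting descriptor flags at creation time rather than with a follow-up fcntl(3) call, the sketch below creates a socket pair that is close-on-exec and non-blocking from the start. The helper name `make_channel` is hypothetical and not taken from vmd(8).

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from vmd): create a socket pair whose
 * descriptors are close-on-exec and non-blocking from the start, so no
 * later fcntl(F_SETFD) or fcntl(F_SETFL) calls are needed.
 * Returns 0 on success, -1 on failure with errno set by socketpair(2).
 */
int
make_channel(int fds[2])
{
	return socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC | SOCK_NONBLOCK,
	    0, fds);
}
```

The flags can still be inspected afterwards with fcntl(F_GETFD) and fcntl(F_GETFL) if a caller wants to verify them.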
|
#
1.96 |
|
18-Jan-2024 |
claudio |
Use imsg_get_fd() in vmd.
vmd uses a lot of fd passing and does it sometimes via extra abstraction so this just tries to convert the code without any optimisations.
ok dv@
|
#
1.95 |
|
10-Jan-2024 |
dv |
vmm/vmd: add io instruction length to exit information.
Add the instruction length to the vm exit information to allow vmd(8) to manipulate the instruction pointer after io emulation. This is preparation for emulating string-based io instructions.
Removes the instruction pointer update from the kernel (vmm(4)) as well as the instruction length checks, which were overly restrictive anyways based on the way prefixes work in x86 instructions.
ok mlarkin@
|
Revision tags: OPENBSD_7_4_BASE
|
#
1.94 |
|
26-Sep-2023 |
dv |
vmd(8): disambiguate log messages per vm and device.
The logging output from vmd(8) often specifies the function performing the logging, but leaves which vm or vm device to guesswork and reading tea leaves.
Change the logging formatting to prefix with information about the specific vm and potentially the device subprocess. Most of this logging is behind the "verbose" mode, but for warnings this will clarify which vm or device logged the warning.
The format of vm/<name>/<device><index> is chosen to be concise and less ugly than other approaches. This adjusts the process naming for devices to match, dropping the use of brackets.
In the process of this change, updating log settings dynamically via vmctl(8) is fixed by properly broadcasting that information to the device subprocesses. The "vmm" process also now updates its own state properly, so settings survive vm reboots.
ok mlarkin@
|
#
1.93 |
|
26-Sep-2023 |
dv |
vmd(8): fix vm pause deadlock.
When vcpu threads pause, they are holding the run mutex lock. If the event thread is asked to assert an irq on the pic and interrupts are pending, it will try to take the run mutex lock on the vcpu. This deadlocks.
Release the lock in the vcpu thread before waiting on the pause condition variable.
ok mlarkin@
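The locking order described above might be sketched as follows. This is a minimal illustration under assumed names (`struct vcpu_sync`, `vcpu_wait_while_paused`), not vmd's actual code: the vcpu thread drops the run mutex before sleeping on the pause condition, then takes it again once unpaused.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-vcpu state; names are illustrative, not vmd's. */
struct vcpu_sync {
	pthread_mutex_t run_mtx;	/* held while the vcpu is running */
	pthread_mutex_t pause_mtx;	/* protects the two flags below */
	pthread_cond_t	pause_cond;	/* signalled on pause and unpause */
	bool		pause_requested;
	bool		is_paused;
};

/* vcpu thread: called with run_mtx held; returns with run_mtx held. */
void
vcpu_wait_while_paused(struct vcpu_sync *s)
{
	/* Drop run_mtx first so other threads can take it while we sleep. */
	pthread_mutex_unlock(&s->run_mtx);

	pthread_mutex_lock(&s->pause_mtx);
	s->is_paused = true;
	pthread_cond_broadcast(&s->pause_cond);	/* tell waiters we paused */
	while (s->pause_requested)
		pthread_cond_wait(&s->pause_cond, &s->pause_mtx);
	s->is_paused = false;
	pthread_mutex_unlock(&s->pause_mtx);

	/* Reacquire run_mtx before going back to running the vcpu. */
	pthread_mutex_lock(&s->run_mtx);
}
```

With this ordering, a thread that needs `run_mtx` to assert an interrupt is never blocked by a vcpu that is merely parked waiting to be unpaused.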
|
#
1.92 |
|
23-Sep-2023 |
dv |
vmd(8): log vmd's vm id, not vmm's in vcpu_run_loop.
Some guests cause a warning message during a shutdown. Log the vmd vm id and not the kernel vmm id as it's next to useless to the end user. This has annoyed me too much.
|
#
1.91 |
|
06-Sep-2023 |
dv |
vmm(4)/vmd(8): include pending interrupt in vm_run_params.
To remove an ioctl(2) from the vcpu thread hotpath in vmd(8), add a flag in the vm_run_params structure to indicate if there's another interrupt pending. This reduces latency in vcpu work related to i/o as we save a trip into the kernel just to flip the interrupt pending flag on or off.
Tested by phessler@, mbuhl@, stsp@, and Mischa Peters.
ok mlarkin@
|
#
1.90 |
|
13-Jul-2023 |
dv |
vmd(8): pull validation into local prefix parser.
Validation for local prefixes, both inet and inet6, was scattered around. To make it even more confusing, vmd was using generic address parsing logic from prior network daemons. vmd doesn't need to parse addresses other than when parsing the local prefix settings in vm.conf and no runtime parsing is needed.
This change merges parsing and validation based on vmd's specific needs for local prefixes (e.g. reserving enough bits for vm id and network interface id encoding in an ipv4 address). In addition, it simplifies the struct from a generic address struct to one focused on just storing the v4 and v6 prefixes and masks. This cleans up a TAILQ struct member that isn't used by vmd and was leftover copy-pasta from those prior daemons.
The address parsing that vmd uses is also updated to using the latest logic in bgpd(8).
ok mlarkin@
|
#
1.89 |
|
13-May-2023 |
dv |
vmm(4)/vmd(8): switch to anonymous shared mappings.
While splitting out emulated virtio network and block devices into separate processes, I originally used named mappings via shm_mkstemp(3). While this functionally achieved the desired result, it had two unintended consequences:
1) tearing down a vm process and its child processes required excessive locking as the guest memory was tied into the VFS layer.
2) it was observed by mlarkin@ that actions in other parts of the VFS layer could cause some of the guest memory to flush to storage, possibly filling /tmp.
This commit adds a new vmm(4) ioctl dedicated to allowing a process to request that the kernel share a mapping of guest memory into its own vm space. This requires an open fd to /dev/vmm (requiring root) and both the "vmm" and "proc" pledge(2) promises. In addition, the caller must know enough about the original memory ranges to reconstruct them to make the vm's ranges.
Tested with help from Mischa Peters.
ok mlarkin@
|
#
1.88 |
|
28-Apr-2023 |
dv |
vmd(8)/vmctl(8): allow vm owners to override boot kernel.
vmd allows non-root users to "own" a vm defined in vm.conf(5). While the user can start/stop the vm, if they break their filesystem they have no means of booting recovery media like a ramdisk kernel.
This change opens the provided boot kernel via vmctl and passes the file descriptor through the control channel to vmd. The next boot of the vm will use the provided file descriptor as boot kernel/bios. Subsequent boots (e.g. a reboot) will return to using behavior defined in vm.conf or the default bios image.
ok mlarkin@
|
#
1.87 |
|
27-Apr-2023 |
dv |
vmd(8): introduce multi-process model for virtio devices.
Isolate virtio network and block device emulation in dedicated processes, forked and exec'd from the vm process. This allows for tightening pledge promises to just "stdio".
Communication between the vcpu's and these devices now occurs via imsg channels, which adds the benefit of not always blocking the vcpu thread while emulating the device.
With this commit, it's possible that vmd is the first open source hypervisor that *defaults* to a multi-process device emulation model without requiring any additional configuration from the operator.
Testing help from phessler@ and Mischa Peters.
ok mlarkin@
|
#
1.86 |
|
25-Apr-2023 |
dv |
vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.
The object sent to vmm(4) contained file paths and details the kernel does not need for cpu virtualization as device emulation is in userland. Effectively, "pull up" the struct members from the vm_create_params struct to the parent vmop_create_params struct.
This allows us to clean up some of vmd(8) and simplify things for switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd, etc.) to allow users to boot recovery ramdisk kernels.
ok mlarkin@
|
#
1.85 |
|
23-Apr-2023 |
dv |
vmd(8): teach vmm process how to exec.
Use execvp(2) to launch vm children with new address spaces. Consequently, introduces use of unveil(2) into the vmm and vm processes.
This imposes the requirement of launching vmd with absolute paths, similar to sshd(8).
ok mlarkin@
|
#
1.84 |
|
23-Apr-2023 |
anton |
unbreak tree by coping with recent s/XCR0/XFEATURE rename
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.83 |
|
06-Feb-2023 |
dv |
vmd(8): scan pci bus to determine bootorder strings.
vmd's SeaBIOS bootorder strings had hardcoded pci device ids, so if a user added a network interface the bootorder strings didn't line up with reality. Using vmctl(8) to boot from a cdrom (-B cdrom) would fail, for instance, if attaching both a nic and a disk as well.
This change scans the pci devices and finds the first of each type to construct viable bootorder strings.
ok jan@
|
#
1.82 |
|
28-Jan-2023 |
dv |
Move some header definitions from vmm(4) to vmd(8).
Part of an ongoing effort to move userland-specific information out of a kernel header and directly into vmd(8). No functional change.
ok mlarkin@
|
#
1.81 |
|
08-Jan-2023 |
dv |
vmd(8): add thread names to vm process.
ok guenther@.
|
#
1.80 |
|
04-Jan-2023 |
dv |
Typos in vmd error message. No functional change.
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
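To make the pfn versus address distinction concrete, here is a small sketch of the conversion. It assumes the legacy virtio-pci convention of 4096-byte pages; the macro and function names are illustrative only.

```c
#include <stdint.h>

/* Legacy virtio-pci queue addresses are expressed in 4096-byte pages. */
#define VQ_PAGE_SHIFT	12	/* illustrative name for this sketch */

/*
 * Convert the 32-bit page frame number written by the driver into a
 * guest physical address. The widening cast happens before the shift
 * so that large pfn values are not truncated.
 */
uint64_t
vq_pfn_to_gpa(uint32_t pfn)
{
	return (uint64_t)pfn << VQ_PAGE_SHIFT;
}
```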
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
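A minimal sketch of a half-open range check, which is the kind of test this fix is about. The struct and function names are hypothetical and do not come from vmd(8).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical range descriptor for this sketch. */
struct mem_range {
	uint64_t gpa;	/* guest physical start address */
	uint64_t size;	/* length in bytes */
};

/*
 * A range covers [gpa, gpa + size): the end address is exclusive.
 * Comparing the offset against size avoids overflow in gpa + size.
 */
bool
range_contains(const struct mem_range *r, uint64_t gpa)
{
	return gpa >= r->gpa && gpa - r->gpa < r->size;
}
```

An inclusive comparison against `gpa + size` would accept the first byte past the end of the range, which is the off-by-one described above.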
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequent.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
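A short sketch of the convention this commit enforces, using open(2) as the example system call. The wrapper name `open_ro` is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a file read-only.
 * Returns the descriptor, or -1 with errno left as set by open(2).
 */
int
open_ro(const char *path)
{
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1)		/* compare against -1, not "< 0" */
		return -1;
	return fd;
}
```

Only the -1 return is documented as the error indicator, and errno is only meaningful in that case, so the check names that value explicitly.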
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
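The single-bitfield approach could look roughly like the sketch below. The flag names, values, and `struct vm_rec` are illustrative assumptions, not vmd's actual definitions.

```c
#include <stdbool.h>

/* Illustrative flag names and values; vmd's own definitions may differ. */
#define ST_RUNNING	0x01u
#define ST_PAUSED	0x02u
#define ST_SHUTDOWN	0x04u

/* Hypothetical vm record holding all state bits in one field. */
struct vm_rec {
	unsigned int state;
};

/* Set one or more state bits. */
void
vm_set(struct vm_rec *vm, unsigned int flag)
{
	vm->state |= flag;
}

/* Clear one or more state bits. */
void
vm_clear(struct vm_rec *vm, unsigned int flag)
{
	vm->state &= ~flag;
}

/* Report whether any of the given bits are set. */
bool
vm_has(const struct vm_rec *vm, unsigned int flag)
{
	return (vm->state & flag) != 0;
}
```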
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power on defaults, and bump the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
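The lookup order for derived images can be illustrated with a deliberately simplified model. Everything here (`struct image`, one byte per cluster, `image_read`) is a hypothetical stand-in and not the qcow2 code in vmd.

```c
#include <stddef.h>

#define NCLUSTERS 4

/* Hypothetical, heavily simplified image: one byte of data per cluster. */
struct image {
	const struct image *base;	/* NULL for a base image */
	unsigned char allocated[NCLUSTERS];	/* 1 if cluster is present */
	unsigned char data[NCLUSTERS];
};

/*
 * Walk from the derived image toward the base until some image in the
 * chain holds the cluster. Returns the data byte, 0 if no image has the
 * cluster allocated, or -1 if the cluster index is out of range.
 */
int
image_read(const struct image *img, size_t cluster)
{
	if (cluster >= NCLUSTERS)
		return -1;
	for (; img != NULL; img = img->base) {
		if (img->allocated[cluster])
			return img->data[cluster];
	}
	return 0;
}
```

Because reads fall through to the base for any cluster the derived image lacks, changing the base afterwards changes what the derived image appears to contain, which is the corruption risk noted above.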
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
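A small sketch of reporting the channel 2 state in bit 5 of a port 0x61 read. The function name and the `other_bits` parameter are assumptions made for illustration; vmd's handler is structured differently.

```c
#include <stdbool.h>
#include <stdint.h>

#define PORT61_PIT2_OUT	0x20	/* bit 5: PIT channel 2 output state */

/*
 * Build the value a guest reads from port 0x61. Bit 5 reflects whether
 * channel 2 has fired; all other bits are passed through unchanged.
 */
uint8_t
port61_read(uint8_t other_bits, bool ch2_fired)
{
	uint8_t v;

	v = other_bits & (uint8_t)~PORT61_PIT2_OUT;
	if (ch2_fired)
		v |= PORT61_PIT2_OUT;
	return v;
}
```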
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
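The idea behind a vaddr_mem()-style lookup might be sketched as below. The struct layout and the name `gpa_to_host` are hypothetical; the sketch also encodes the stated assumption that a requested span must sit entirely inside one memory range.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping of one guest physical range into host memory. */
struct guest_range {
	uint64_t gpa;	/* guest physical start address */
	size_t	 size;	/* length in bytes */
	void	*hva;	/* where the range is mapped in this process */
};

/*
 * Return a host pointer for [gpa, gpa + len), or NULL if the address is
 * not backed by any range or the span would cross a range boundary.
 */
void *
gpa_to_host(const struct guest_range *r, size_t nranges, uint64_t gpa,
    size_t len)
{
	size_t i;
	uint64_t off;

	for (i = 0; i < nranges; i++) {
		if (gpa < r[i].gpa || gpa - r[i].gpa >= r[i].size)
			continue;
		off = gpa - r[i].gpa;
		if (len > r[i].size - off)
			return NULL;	/* span leaves this range */
		return (uint8_t *)r[i].hva + off;
	}
	return NULL;
}
```

A pointer obtained this way can be used directly, or placed in a struct iovec for readv(2) and writev(2), without copying guest memory into a bounce buffer.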
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works), skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
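As a rough illustration of the /31 scheme, the two addresses of such a prefix can be split between the host's tap(4) interface and the guest. Which address goes to which side is an assumption here, and `split_slash31` is a hypothetical helper, not a vmd function.

```python
import ipaddress


def split_slash31(prefix: str):
    """Return (host_addr, guest_addr) for an IPv4 /31 prefix.

    Assumption for illustration: the host takes the even address and the
    guest the odd one. vmd's real assignment may differ.
    """
    net = ipaddress.ip_network(prefix)
    if net.version != 4 or net.prefixlen != 31:
        raise ValueError("expected an IPv4 /31 prefix")
    # A /31 contains exactly two addresses.
    return net[0], net[1]
```

For example, `split_slash31("100.64.1.2/31")` yields the pair 100.64.1.2 and 100.64.1.3.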
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where the OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
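The first bug is easy to picture: a one-byte port read should replace only the low 8 bits of %eax and leave the rest untouched. A minimal sketch, with a hypothetical helper name:

```python
def merge_in_byte(eax: int, data: int) -> int:
    """Return eax with only its low 8 bits replaced by the byte read.

    Models a one-byte IN: the upper 24 bits of the 32-bit register are
    preserved rather than zeroed.
    """
    return (eax & 0xFFFFFF00) | (data & 0xFF)
```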
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
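Conceptually, a locked lladdr means frames from the guest are accepted only if their Ethernet source address matches the configured one. The sketch below is an assumption about the general idea, not vmd's implementation, and `frame_allowed` is a hypothetical name.

```python
ETHER_HDR_LEN = 14  # destination (6) + source (6) + ethertype (2)


def frame_allowed(frame: bytes, locked_mac: bytes) -> bool:
    """Accept a guest frame only if its source MAC equals the locked address."""
    if len(frame) < ETHER_HDR_LEN:
        return False
    # Bytes 6..11 of an Ethernet header hold the source address.
    return frame[6:12] == locked_mac
```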
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.99 |
|
09-Apr-2024 |
dv |
vmm/vmd: add exception injection and refactor inject api.
In order to continue work on mmio and other instruction emulation, vmd(8) needs the ability to inject exceptions (like page faults) from userland.
Refactor the way events are injected from userland, cleaning up how hardware (external) interrupts are injected in the process.
ok mlarkin@
|
Revision tags: OPENBSD_7_5_BASE
|
#
1.98 |
|
20-Feb-2024 |
dv |
Utilize separate threads for RX and TX in vmd(8)'s vionet.
This commit adds multithreading to allow both virtqueues to be processed in parallel along with additional synchronization primitives to protect device configuration state. Allowing RX and TX to operate independently reduces overall network latency for guests and helps alleviate the TX side dominating cpu time.
Tested with help from phessler@, kn@, and mlarkin@. ok mlarkin@.
|
#
1.97 |
|
05-Feb-2024 |
dv |
Cleanup fcntl(3) usage and fd lifetimes in vmd(8).
Remove extraneous fcntl(3) usage for setting fd features that can be set at time of open(2), pipe2(2), or socketpair(2). Also cleans up pty creation switching to using functions from libutil instead of direct ioctl(2) calls.
ok mlarkin@, original diff ok claudio@ as well.
|
#
1.96 |
|
18-Jan-2024 |
claudio |
Use imsg_get_fd() in vmd.
vmd uses a lot of fd passing and does it sometimes via extra abstraction so this just tries to convert the code without any optimisations.
ok dv@
|
#
1.95 |
|
10-Jan-2024 |
dv |
vmm/vmd: add io instruction length to exit information.
Add the instruction length to the vm exit information to allow vmd(8) to manipulate the instruction pointer after io emulation. This is preparation for emulating string-based io instructions.
Removes the instruction pointer update from the kernel (vmm(4)) as well as the instruction length checks, which were overly restrictive anyways based on the way prefixes work in x86 instructions.
ok mlarkin@
|
Revision tags: OPENBSD_7_4_BASE
|
#
1.94 |
|
26-Sep-2023 |
dv |
vmd(8): disambiguate log messages per vm and device.
The logging output from vmd(8) often specifies the function performing the logging, but leaves which vm or vm device to guesswork and reading tea leaves.
Change the logging formatting to prefix with information about the specific vm and potentially the device subprocess. Most of this logging is behind the "verbose" mode, but for warnings this will clarify which vm or device logged the warning.
The format of vm/<name>/<device><index> is chosen to be concise and less ugly than other approaches. This adjusts the process naming for devices to match, dropping the use of brackets.
In the process of this change, updating log settings dynamically via vmctl(8) is fixed by properly broadcasting that information to the device subprocesses. The "vmm" process also now updates its own state properly, so settings survive vm reboots.
ok mlarkin@
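The prefix format can be illustrated with a small helper. Only the `vm/<name>/<device><index>` shape comes from the commit message; the function and its parameters are hypothetical.

```python
def log_prefix(vm_name: str, device: str = None, index: int = None) -> str:
    """Build a log prefix of the form vm/<name> or vm/<name>/<device><index>."""
    if device is None:
        return f"vm/{vm_name}"
    return f"vm/{vm_name}/{device}{index}"
```

A warning from the first network device of a vm named "alpine" would then be prefixed with `vm/alpine/vionet0`.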
|
#
1.93 |
|
26-Sep-2023 |
dv |
vmd(8): fix vm pause deadlock.
When vcpu threads pause, they are holding the run mutex lock. If the event thread is asked to assert an irq on the pic and interrupts are pending, it will try to take the run mutex lock on the vcpu. This deadlocks.
Release the lock in the vcpu thread before waiting on the pause condition variable.
ok mlarkin@
|
#
1.92 |
|
23-Sep-2023 |
dv |
vmd(8): log vmd's vm id, not vmm's in vcpu_run_loop.
Some guests cause a warning message during a shutdown. Log the vmd vm id and not the kernel vmm id as it's next to useless to the end user. This has annoyed me too much.
|
#
1.91 |
|
06-Sep-2023 |
dv |
vmm(4)/vmd(8): include pending interrupt in vm_run_params.
To remove an ioctl(2) from the vcpu thread hotpath in vmd(8), add a flag in the vm_run_params structure to indicate if there's another interrupt pending. This reduces latency in vcpu work related to i/o as we save a trip into the kernel just to flip the interrupt pending flag on or off.
Tested by phessler@, mbuhl@, stsp@, and Mischa Peters.
ok mlarkin@
|
#
1.90 |
|
13-Jul-2023 |
dv |
vmd(8): pull validation into local prefix parser.
Validation for local prefixes, both inet and inet6, was scattered around. To make it even more confusing, vmd was using generic address parsing logic from prior network daemons. vmd doesn't need to parse addresses other than when parsing the local prefix settings in vm.conf and no runtime parsing is needed.
This change merges parsing and validation based on vmd's specific needs for local prefixes (e.g. reserving enough bits for vm id and network interface id encoding in an ipv4 address). In addition, it simplifies the struct from a generic address struct to one focused on just storing the v4 and v6 prefixes and masks. This cleans up an unused TAILQ struct member that isn't used by vmd and was leftover copy-pasta from those prior daemons.
The address parsing that vmd uses is also updated to using the latest logic in bgpd(8).
ok mlarkin@
|
#
1.89 |
|
13-May-2023 |
dv |
vmm(4)/vmd(8): switch to anonymous shared mappings.
While splitting out emulated virtio network and block devices into separate processes, I originally used named mappings via shm_mkstemp(3). While this functionally achieved the desired result, it had two unintended consequences:
1) tearing down a vm process and its child processes required excessive locking as the guest memory was tied into the VFS layer.
2) it was observed by mlarkin@ that actions in other parts of the VFS layer could cause some of the guest memory to flush to storage, possibly filling /tmp.
This commit adds a new vmm(4) ioctl dedicated to allowing a process to request that the kernel share a mapping of guest memory into its own vm space. This requires an open fd to /dev/vmm (requiring root) and both the "vmm" and "proc" pledge(2) promises. In addition, the caller must know enough about the original memory ranges to reconstruct them to make the vm's ranges.
Tested with help from Mischa Peters.
ok mlarkin@
|
#
1.88 |
|
28-Apr-2023 |
dv |
vmd(8)/vmctl(8): allow vm owners to override boot kernel.
vmd allows non-root users to "own" a vm defined in vm.conf(5). While the user can start/stop the vm, if they break their filesystem they have no means of booting recovery media like a ramdisk kernel.
This change opens the provided boot kernel via vmctl and passes the file descriptor through the control channel to vmd. The next boot of the vm will use the provided file descriptor as boot kernel/bios. Subsequent boots (e.g. a reboot) will return to using behavior defined in vm.conf or the default bios image.
ok mlarkin@
|
#
1.87 |
|
27-Apr-2023 |
dv |
vmd(8): introduce multi-process model for virtio devices.
Isolate virtio network and block device emulation in dedicated processes, forked and exec'd from the vm process. This allows for tightening pledge promises to just "stdio".
Communication between the vcpu's and these devices now occurs via imsg channels, which adds the benefit of not always blocking the vcpu thread while emulating the device.
With this commit, it's possible that vmd is the first open source hypervisor that *defaults* to a multi-process device emulation model without requiring any additional configuration from the operator.
Testing help from phessler@ and Mischa Peters.
ok mlarkin@
|
#
1.86 |
|
25-Apr-2023 |
dv |
vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.
The object sent to vmm(4) contained file paths and details the kernel does not need for cpu virtualization as device emulation is in userland. Effectively, "pull up" the struct members from the vm_create_params struct to the parent vmop_create_params struct.
This allows us to clean up some of vmd(8) and simplify things for switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd, etc.) to allow users to boot recovery ramdisk kernels.
ok mlarkin@
|
#
1.85 |
|
23-Apr-2023 |
dv |
vmd(8): teach vmm process how to exec.
Use execvp(2) to launch vm children with new address spaces. Consequently, introduces use of unveil(2) into the vmm and vm processes.
This imposes the requirement of launching vmd with absolute paths, similar to sshd(8).
ok mlarkin@
|
#
1.84 |
|
23-Apr-2023 |
anton |
unbreak tree by coping with recent s/XCR0/XFEATURE rename
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.83 |
|
06-Feb-2023 |
dv |
vmd(8): scan pci bus to determine bootorder strings.
vmd's SeaBIOS bootorder strings had hardcoded pci device ids, so if a user added a network interface the bootorder strings didn't line up with reality. Using vmctl(8) to boot from a cdrom (-B cdrom) would fail, for instance, if attaching both a nic and a disk as well.
This change scans the pci devices and finds the first of each type to construct viable bootorder strings.
ok jan@
|
#
1.82 |
|
28-Jan-2023 |
dv |
Move some header definitions from vmm(4) to vmd(8).
Part of an ongoing effort to move userland-specific information out of a kernel header and directly into vmd(8). No functional change.
ok mlarkin@
|
#
1.81 |
|
08-Jan-2023 |
dv |
vmd(8): add thread names to vm process.
ok guenther@.
|
#
1.80 |
|
04-Jan-2023 |
dv |
Typos in vmd error message. No functional change.
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
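The pfn versus address distinction comes down to one multiplication. The sketch assumes the legacy virtio-pci convention of 4096-byte pages for the queue address register; the constant and function names are illustrative.

```python
# Assumption: legacy virtio-pci queue addresses are expressed in
# units of 4096-byte pages.
VIRTIO_PAGE_SIZE = 4096


def queue_pfn_to_gpa(pfn: int) -> int:
    """Convert the 32-bit page frame number written by the driver
    into a guest physical address."""
    return pfn * VIRTIO_PAGE_SIZE
```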
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
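The class of bug fixed here is the usual inclusive versus exclusive upper bound. A minimal sketch of a half-open range check, with a hypothetical function name:

```python
def gpa_in_range(gpa: int, start: int, size: int) -> bool:
    """True if gpa lies in the half-open interval [start, start + size).

    Using <= on the upper bound would wrongly accept the first byte
    past the end of the range.
    """
    return start <= gpa < start + size
```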
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
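One way to picture the remaining whole-megabyte requirement on a byte-denominated size is shown below. The helper is hypothetical, and it assumes "megabyte" means 1024 * 1024 bytes.

```python
MIB = 1024 * 1024  # assumption: "megabyte" here means 2**20 bytes


def is_whole_mib(nbytes: int) -> bool:
    """True if a memory size given in bytes is a positive multiple of 1 MiB."""
    return nbytes > 0 and nbytes % MIB == 0
```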
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a round about way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
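The idea of folding several state variables into one bitfield can be sketched as below. The flag names and values are made up for illustration and are not vmd's actual definitions.

```python
# Hypothetical flag values, chosen only for illustration.
VM_STATE_RUNNING = 0x01
VM_STATE_PAUSED = 0x02
VM_STATE_SHUTDOWN = 0x04


def set_state(state: int, flag: int) -> int:
    """Return state with flag set."""
    return state | flag


def clear_state(state: int, flag: int) -> int:
    """Return state with flag cleared."""
    return state & ~flag


def has_state(state: int, flag: int) -> bool:
    """True if flag is set in state."""
    return (state & flag) != 0
```

A single integer then answers every "is the vm in state X" question, and reporting the state is a matter of testing each bit in turn.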
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power on defaults, and bump the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
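The lookup order described in this commit (derived image first, then the base) can be modelled with a toy example. Images are represented here as plain dictionaries mapping cluster index to data; this says nothing about the real qcow2 on-disk layout, and `read_cluster` is a hypothetical name.

```python
def read_cluster(images, index):
    """Look up a cluster in a chain of images ordered derived-first.

    Each image is modelled as a dict of {cluster_index: bytes}. Returns
    None when no image in the chain has the cluster allocated.
    """
    for image in images:
        if index in image:
            return image[index]
    return None
```

With `base = {0: b"base0", 1: b"base1"}` and `derived = {1: b"new1"}`, reading cluster 0 through `[derived, base]` falls through to the base image while cluster 1 comes from the derived one. It also shows why modifying the base afterwards corrupts the derived image: every cluster the derived image never wrote is still read from the base.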
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
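A sketch of how such a readback might compose the port value follows. It assumes bit 5 is set once channel 2 has fired and clear while it is still counting; the names are hypothetical.

```python
PIT_CH2_OUT = 1 << 5  # bit 5 of port 0x61


def port61_value(base: int, ch2_fired: bool) -> int:
    """Return the byte read from port 0x61, with bit 5 reflecting
    whether PIT channel 2 has fired (assumption: set means fired)."""
    if ch2_fired:
        return (base | PIT_CH2_OUT) & 0xFF
    return base & ~PIT_CH2_OUT & 0xFF
```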
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
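The kind of translation vaddr_mem() is described as doing can be sketched as follows. This is a minimal illustration and not the code in vmd: the struct mem_range type, its fields, and the function name gpa_to_hva() are assumptions made for the example. It returns NULL when the requested span does not fit inside a single range, which mirrors the contiguity assumption stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one guest memory range mapped into the host. */
struct mem_range {
	uint64_t  gpa;	/* guest physical start address */
	size_t	  size;	/* length of the range in bytes */
	uint8_t	 *hva;	/* where the range is mapped in this process */
};

/*
 * Return a host pointer for guest physical address gpa, or NULL if
 * [gpa, gpa + len) is not fully contained in one range.
 */
void *
gpa_to_hva(const struct mem_range *r, size_t n, uint64_t gpa, size_t len)
{
	uint64_t off;
	size_t i;

	assert(r != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (gpa < r[i].gpa)
			continue;
		off = gpa - r[i].gpa;
		/* Written to avoid overflow in gpa + len. */
		if (off >= r[i].size || len > r[i].size - off)
			continue;
		return (r[i].hva + off);
	}
	return (NULL);
}
```

A caller that receives NULL would have to fall back to copying the data piecewise, since the span crosses a range boundary or is not backed by guest memory at all.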
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
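The rate calculation described above can be illustrated with a short sketch. It is not the arithmetic used in the ns8250 module: the function name chars_per_tick() and the assumption of 10 bits on the wire per character (start bit, 8 data bits, stop bit) are made up for the example.

```c
#include <assert.h>

/*
 * Approximate number of characters that may be transmitted per clock
 * tick at the given baud rate. Assumes 10 bits per character and hz
 * ticks per second; both are assumptions of this sketch.
 */
unsigned int
chars_per_tick(unsigned int baud, unsigned int hz)
{
	assert(hz > 0);
	return (baud / 10 / hz);
}
```

With 100 ticks per second this gives a budget of 115 characters per tick at 115200 baud but only 9 at 9600 baud, which is consistent with the advice to select a faster rate in the guest. At very low baud rates the integer division yields 0, so a real implementation would need to handle that case separately.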
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
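One way to picture the /31 pairing is the sketch below. The bit layout is hypothetical and chosen only for illustration (vmd's actual encoding may differ), as are the function name local_addr() and its parameters. Each (vm, interface) pair owns one /31; the even address goes to the host's tap(4) side and the odd address is the one offered to the guest.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout, addresses in host byte order:
 *   prefix | vm_id << 8 | if_idx << 1 | (guest ? 1 : 0)
 * The low bit selects the two ends of the /31.
 */
uint32_t
local_addr(uint32_t prefix, unsigned int vm_id, unsigned int if_idx, int guest)
{
	assert(if_idx < 128);	/* must fit in the 7 bits below vm_id */
	return (prefix | ((uint32_t)vm_id << 8) | ((uint32_t)if_idx << 1) |
	    (guest ? 1u : 0u));
}
```

Under this layout, and with an example prefix of 100.64.0.0, the first interface of vm 1 would get 100.64.1.0 on the host and 100.64.1.1 in the guest.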
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.98 |
|
20-Feb-2024 |
dv |
Utilize separate threads for RX and TX in vmd(8)'s vionet.
This commit adds multithreading to allow both virtqueues to be processed in parallel along with additional synchronization primitives to protect device configuration state. Allowing RX and TX to operate independently reduces overall network latency for guests and helps alleviate the TX side dominating cpu time.
Tested with help from phessler@, kn@, and mlarkin@. ok mlarkin@.
|
#
1.97 |
|
05-Feb-2024 |
dv |
Cleanup fcntl(3) usage and fd lifetimes in vmd(8).
Remove extraneous fcntl(3) usage for setting fd features that can be set at time of open(2), pipe2(2), or socketpair(2). Also cleans up pty creation switching to using functions from libutil instead of direct ioctl(2) calls.
ok mlarkin@, original diff ok claudio@ as well.
|
#
1.96 |
|
18-Jan-2024 |
claudio |
Use imsg_get_fd() in vmd.
vmd uses a lot of fd passing and does it sometimes via extra abstraction so this just tries to convert the code without any optimisations.
ok dv@
|
#
1.95 |
|
10-Jan-2024 |
dv |
vmm/vmd: add io instruction length to exit information.
Add the instruction length to the vm exit information to allow vmd(8) to manipulate the instruction pointer after io emulation. This is preparation for emulating string-based io instructions.
Removes the instruction pointer update from the kernel (vmm(4)) as well as the instruction length checks, which were overly restrictive anyways based on the way prefixes work in x86 instructions.
ok mlarkin@
|
Revision tags: OPENBSD_7_4_BASE
|
#
1.94 |
|
26-Sep-2023 |
dv |
vmd(8): disambiguate log messages per vm and device.
The logging output from vmd(8) often specifies the function performing the logging, but leaves which vm or vm device to guesswork and reading tea leaves.
Change the logging formatting to prefix with information about the specific vm and potentially the device subprocess. Most of this logging is behind the "verbose" mode, but for warnings this will clarify which vm or device logged the warning.
The format of vm/<name>/<device><index> is chosen to be concise and less ugly than other approaches. This adjusts the process naming for devices to match, dropping the use of brackets.
In the process of this change, updating log settings dynamically via vmctl(8) is fixed by properly broadcasting that information to the device subprocesses. The "vmm" process also now updates its own state properly, so settings survive vm reboots.
ok mlarkin@
|
#
1.93 |
|
26-Sep-2023 |
dv |
vmd(8): fix vm pause deadlock.
When vcpu threads pause, they are holding the run mutex lock. If the event thread is asked to assert an irq on the pic and interrupts are pending, it will try to take the run mutex lock on the vcpu. This deadlocks.
Release the lock in the vcpu thread before waiting on the pause condition variable.
ok mlarkin@
|
#
1.92 |
|
23-Sep-2023 |
dv |
vmd(8): log vmd's vm id, not vmm's in vcpu_run_loop.
Some guests cause a warning message during a shutdown. Log the vmd vm id and not the kernel vmm id as it's next to useless to the end user. This has annoyed me too much.
|
#
1.91 |
|
06-Sep-2023 |
dv |
vmm(4)/vmd(8): include pending interrupt in vm_run_params.
To remove an ioctl(2) from the vcpu thread hotpath in vmd(8), add a flag in the vm_run_params structure to indicate if there's another interrupt pending. This reduces latency in vcpu work related to i/o as we save a trip into the kernel just to flip the interrupt pending flag on or off.
Tested by phessler@, mbuhl@, stsp@, and Mischa Peters.
ok mlarkin@
|
#
1.90 |
|
13-Jul-2023 |
dv |
vmd(8): pull validation into local prefix parser.
Validation for local prefixes, both inet and inet6, was scattered around. To make it even more confusing, vmd was using generic address parsing logic from prior network daemons. vmd doesn't need to parse addresses other than when parsing the local prefix settings in vm.conf and no runtime parsing is needed.
This change merges parsing and validation based on vmd's specific needs for local prefixes (e.g. reserving enough bits for vm id and network interface id encoding in an ipv4 address). In addition, it simplifies the struct from a generic address struct to one focused on just storing the v4 and v6 prefixes and masks. This cleans up an unused TAILQ struct member that isn't used by vmd and was leftover copy-pasta from those prior daemons.
The address parsing that vmd uses is also updated to using the latest logic in bgpd(8).
ok mlarkin@
|
#
1.89 |
|
13-May-2023 |
dv |
vmm(4)/vmd(8): switch to anonymous shared mappings.
While splitting out emulated virtio network and block devices into separate processes, I originally used named mappings via shm_mkstemp(3). While this functionally achieved the desired result, it had two unintended consequences:
1) tearing down a vm process and its child processes required excessive locking as the guest memory was tied into the VFS layer.
2) it was observed by mlarkin@ that actions in other parts of the VFS layer could cause some of the guest memory to flush to storage, possibly filling /tmp.
This commit adds a new vmm(4) ioctl dedicated to allowing a process to request that the kernel share a mapping of guest memory into its own vm space. This requires an open fd to /dev/vmm (requiring root) and both the "vmm" and "proc" pledge(2) promises. In addition, the caller must know enough about the original memory ranges to reconstruct them to make the vm's ranges.
Tested with help from Mischa Peters.
ok mlarkin@
|
#
1.88 |
|
28-Apr-2023 |
dv |
vmd(8)/vmctl(8): allow vm owners to override boot kernel.
vmd allows non-root users to "own" a vm defined in vm.conf(5). While the user can start/stop the vm, if they break their filesystem they have no means of booting recovery media like a ramdisk kernel.
This change opens the provided boot kernel via vmctl and passes the file descriptor through the control channel to vmd. The next boot of the vm will use the provided file descriptor as boot kernel/bios. Subsequent boots (e.g. a reboot) will return to using behavior defined in vm.conf or the default bios image.
ok mlarkin@
|
#
1.87 |
|
27-Apr-2023 |
dv |
vmd(8): introduce multi-process model for virtio devices.
Isolate virtio network and block device emulation in dedicated processes, forked and exec'd from the vm process. This allows for tightening pledge promises to just "stdio".
Communication between the vcpu's and these devices now occurs via imsg channels, which adds the benefit of not always blocking the vcpu thread while emulating the device.
With this commit, it's possible that vmd is the first open source hypervisor that *defaults* to a multi-process device emulation model without requiring any additional configuration from the operator.
Testing help from phessler@ and Mischa Peters.
ok mlarkin@
|
#
1.86 |
|
25-Apr-2023 |
dv |
vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.
The object sent to vmm(4) contained file paths and details the kernel does not need for cpu virtualization as device emulation is in userland. Effectively, "pull up" the struct members from the vm_create_params struct to the parent vmop_create_params struct.
This allows us to clean up some of vmd(8) and simplify things for switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd, etc.) to allow users to boot recovery ramdisk kernels.
ok mlarkin@
|
#
1.85 |
|
23-Apr-2023 |
dv |
vmd(8): teach vmm process how to exec.
Use execvp(2) to launch vm children with new address spaces. Consequently, introduces use of unveil(2) into the vmm and vm processes.
This imposes the requirement of launching vmd with absolute paths, similar to sshd(8).
ok mlarkin@
|
#
1.84 |
|
23-Apr-2023 |
anton |
unbreak tree by coping with recent s/XCR0/XFEATURE rename
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.83 |
|
06-Feb-2023 |
dv |
vmd(8): scan pci bus to determine bootorder strings.
vmd's SeaBIOS bootorder strings had hardcoded pci device ids, so if a user added a network interface the bootorder strings didn't line up with reality. Using vmctl(8) to boot from a cdrom (-B cdrom) would fail, for instance, if attaching both a nic and a disk as well.
This change scans the pci devices and finds the first of each type to construct viable bootorder strings.
ok jan@
|
#
1.82 |
|
28-Jan-2023 |
dv |
Move some header definitions from vmm(4) to vmd(8).
Part of an ongoing effort to move userland-specific information out of a kernel header and directly into vmd(8). No functional change.
ok mlarkin@
|
#
1.81 |
|
08-Jan-2023 |
dv |
vmd(8): add thread names to vm process.
ok guenther@.
|
#
1.80 |
|
04-Jan-2023 |
dv |
Typos in vmd error message. No functional change.
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
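The class of bug described here comes down to the choice between an inclusive and an exclusive upper bound. The following is a minimal sketch of the corrected check, with a hypothetical function name and signature rather than the code from vmd:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if gpa lies in the half-open range [start, start + size). */
int
gpa_in_range(uint64_t gpa, uint64_t start, size_t size)
{
	assert(size > 0);
	/* Comparing with "<=" here would accept one byte past the end. */
	return (gpa >= start && gpa - start < size);
}
```

With adjacent ranges, an inclusive comparison would let the last-plus-one address of one range match before the range that really owns it, which is how a search can start from the wrong range.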
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a round about way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
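The convention described in this commit looks like the following in practice. The helper probe_path() is a hypothetical function written for this illustration and is not taken from the tree; open(2), close(2), and errno are used as documented.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Open path read-only and close it again. Returns 0 on success or the
 * errno value reported by open(2) on failure.
 */
int
probe_path(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd == -1)		/* compare against -1, not "fd < 0" */
		return (errno);
	close(fd);
	return (0);
}
```

Because errno is only meaningful when the call returned -1, reading it immediately after that exact check avoids picking up a stale value from an earlier call.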
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
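The single-bitfield approach can be sketched as below. The flag names, the struct, and the helper functions are hypothetical and carry an EX_/ex_ prefix to make that clear; the real definitions live in vmd's own headers and may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state flags, one bit each. */
#define EX_STATE_RUNNING	0x01
#define EX_STATE_PAUSED		0x02
#define EX_STATE_SHUTDOWN	0x04

struct ex_vm {
	unsigned int state;	/* bitwise OR of EX_STATE_* flags */
};

/* Set the paused bit without disturbing the other states. */
void
ex_pause(struct ex_vm *vm)
{
	assert(vm != NULL);
	vm->state |= EX_STATE_PAUSED;
}

/* Clear the paused bit again. */
void
ex_unpause(struct ex_vm *vm)
{
	assert(vm != NULL);
	vm->state &= ~EX_STATE_PAUSED;
}

/* Non-zero if every bit in flags is set in the vm's state. */
int
ex_has_state(const struct ex_vm *vm, unsigned int flags)
{
	assert(vm != NULL);
	return ((vm->state & flags) == flags);
}
```

Keeping everything in one word means a combined condition such as "running and paused" is a single mask comparison, and the whole state can be reported or serialized as one integer.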
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power on defaults, and bump the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
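The lookup order for derived images can be sketched as a walk along a chain of images. The struct ex_image type and ex_lookup() are hypothetical and greatly simplified compared with a real qcow2 reader (no L1/L2 tables, no refcounts); they only illustrate the fall-through from the derived image to its base.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory stand-in for a disk image. */
struct ex_image {
	const struct ex_image	 *base;		/* NULL for a base image */
	const unsigned char	**clusters;	/* NULL entry: not stored here */
	size_t			  nclusters;
};

/*
 * Return the data for cluster idx, searching the derived image first
 * and then each base in turn. NULL means no image in the chain holds
 * the cluster.
 */
const unsigned char *
ex_lookup(const struct ex_image *img, size_t idx)
{
	assert(img != NULL);
	for (; img != NULL; img = img->base) {
		if (idx < img->nclusters && img->clusters[idx] != NULL)
			return (img->clusters[idx]);
	}
	return (NULL);
}
```

The sketch also shows why modifying the base is harmful: the derived image stores only the clusters that differ, so any change underneath it silently alters what the derived image reads.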
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
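As a rough sketch of the readback described in this entry, the value returned for a read of port 0x61 can be formed by reflecting the channel 2 state in bit 5. The function and macro names below are hypothetical, and the handling of the remaining bits is an assumption; vmd's actual handler is not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_PORT61_CH2_OUT	(1u << 5)	/* bit 5: PIT channel 2 status */

/*
 * Build the byte returned to the guest for a read of port 0x61.
 * "latched" holds whatever the other bits currently are (assumed to be
 * tracked elsewhere); bit 5 is set only if channel 2 has fired.
 */
uint8_t
ex_port61_read(uint8_t latched, bool ch2_fired)
{
	uint8_t v = latched & (uint8_t)~EX_PORT61_CH2_OUT;

	if (ch2_fired)
		v |= EX_PORT61_CH2_OUT;
	return v;
}
```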
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guests physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
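One way to picture what iovec_mem() does is the sketch below, which turns a guest physical region into host pointers without copying. It is only an assumption-laden illustration: `struct ex_range` and `ex_iovec_mem()` are hypothetical, the ranges are assumed to be sorted by guest address and non-overlapping, and the real function's signature and error handling may differ.

```c
#include <sys/types.h>
#include <sys/uio.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one guest memory range mapped into vmd. */
struct ex_range {
	uint64_t	 gpa;	/* guest physical base address */
	size_t		 size;	/* length in bytes */
	uint8_t		*hva;	/* where the range is mapped in this process */
};

/*
 * Fill iov[] with host pointers covering the guest physical region
 * [gpa, gpa + len). Returns the number of iovec entries used, or -1 if
 * the region is not fully backed or iov[] is too small.
 */
int
ex_iovec_mem(const struct ex_range *r, size_t nr, uint64_t gpa, size_t len,
    struct iovec *iov, int iovcnt)
{
	size_t i;
	int n = 0;

	for (i = 0; i < nr && len > 0; i++) {
		uint64_t off;
		size_t chunk;

		/* Skip ranges that do not contain the current address. */
		if (gpa < r[i].gpa || gpa - r[i].gpa >= r[i].size)
			continue;
		if (n == iovcnt)
			return -1;

		off = gpa - r[i].gpa;
		chunk = (size_t)(r[i].size - off);
		if (chunk > len)
			chunk = len;

		iov[n].iov_base = r[i].hva + off;
		iov[n].iov_len = chunk;
		n++;

		gpa += chunk;
		len -= chunk;
	}
	return len == 0 ? n : -1;
}
```

The resulting array can be handed to readv(2) or writev(2) so data moves directly between guest memory and a disk image file.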
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where the OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.97 |
|
05-Feb-2024 |
dv |
Cleanup fcntl(3) usage and fd lifetimes in vmd(8).
Remove extraneous fcntl(3) usage for setting fd features that can be set at time of open(2), pipe2(2), or socketpair(2). Also cleans up pty creation switching to using functions from libutil instead of direct ioctl(2) calls.
ok mlarkin@, original diff ok claudio@ as well.
|
#
1.96 |
|
18-Jan-2024 |
claudio |
Use imsg_get_fd() in vmd.
vmd uses a lot of fd passing and does it sometimes via extra abstraction so this just tries to convert the code without any optimisations.
ok dv@
|
#
1.95 |
|
10-Jan-2024 |
dv |
vmm/vmd: add io instruction length to exit information.
Add the instruction length to the vm exit information to allow vmd(8) to manipulate the instruction pointer after io emulation. This is preparation for emulating string-based io instructions.
Removes the instruction pointer update from the kernel (vmm(4)) as well as the instruction length checks, which were overly restrictive anyways based on the way prefixes work in x86 instructions.
ok mlarkin@
|
Revision tags: OPENBSD_7_4_BASE
|
#
1.94 |
|
26-Sep-2023 |
dv |
vmd(8): disambiguate log messages per vm and device.
The logging output from vmd(8) often specifies the function performing the logging, but leaves which vm or vm device to guesswork and reading tea leaves.
Change the logging formatting to prefix with information about the specific vm and potentially the device subprocess. Most of this logging is behind the "verbose" mode, but for warnings this will clarify which vm or device logged the warning.
The format of vm/<name>/<device><index> is chosen to be concise and less ugly than other approaches. This adjusts the process naming for devices to match, dropping the use of brackets.
In the process of this change, updating log settings dynamically via vmctl(8) is fixed by properly broadcasting that information to the device subprocesses. The "vmm" process also now updates its own state properly, so settings survive vm reboots.
ok mlarkin@
|
#
1.93 |
|
26-Sep-2023 |
dv |
vmd(8): fix vm pause deadlock.
When vcpu threads pause, they are holding the run mutex lock. If the event thread is asked to assert an irq on the pic and interrupts are pending, it will try to take the run mutex lock on the vcpu. This deadlocks.
Release the lock in the vcpu thread before waiting on the pause condition variable.
ok mlarkin@
|
#
1.92 |
|
23-Sep-2023 |
dv |
vmd(8): log vmd's vm id, not vmm's in vcpu_run_loop.
Some guests cause a warning message during a shutdown. Log the vmd vm id and not the kernel vmm id as it's next to useless to the end user. This has annoyed me too much.
|
#
1.91 |
|
06-Sep-2023 |
dv |
vmm(4)/vmd(8): include pending interrupt in vm_run_params.
To remove an ioctl(2) from the vcpu thread hotpath in vmd(8), add a flag in the vm_run_params structure to indicate if there's another interrupt pending. This reduces latency in vcpu work related to i/o as we save a trip into the kernel just to flip the interrupt pending flag on or off.
Tested by phessler@, mbuhl@, stsp@, and Mischa Peters.
ok mlarkin@
|
#
1.90 |
|
13-Jul-2023 |
dv |
vmd(8): pull validation into local prefix parser.
Validation for local prefixes, both inet and inet6, was scattered around. To make it even more confusing, vmd was using generic address parsing logic from prior network daemons. vmd doesn't need to parse addresses other than when parsing the local prefix settings in vm.conf and no runtime parsing is needed.
This change merges parsing and validation based on vmd's specific needs for local prefixes (e.g. reserving enough bits for vm id and network interface id encoding in an ipv4 address). In addition, it simplifies the struct from a generic address struct to one focused on just storing the v4 and v6 prefixes and masks. This cleans up an unused TAILQ struct member that isn't used by vmd and was leftover copy-pasta from those prior daemons.
The address parsing that vmd uses is also updated to using the latest logic in bgpd(8).
ok mlarkin@
|
#
1.89 |
|
13-May-2023 |
dv |
vmm(4)/vmd(8): switch to anonymous shared mappings.
While splitting out emulated virtio network and block devices into separate processes, I originally used named mappings via shm_mkstemp(3). While this functionally achieved the desired result, it had two unintended consequences:
1) tearing down a vm process and its child processes required excessive locking as the guest memory was tied into the VFS layer.
2) it was observed by mlarkin@ that actions in other parts of the VFS layer could cause some of the guest memory to flush to storage, possibly filling /tmp.
This commit adds a new vmm(4) ioctl dedicated to allowing a process to request that the kernel share a mapping of guest memory into its own vm space. This requires an open fd to /dev/vmm (requiring root) and both the "vmm" and "proc" pledge(2) promises. In addition, the caller must know enough about the original memory ranges to reconstruct them to make the vm's ranges.
Tested with help from Mischa Peters.
ok mlarkin@
|
#
1.88 |
|
28-Apr-2023 |
dv |
vmd(8)/vmctl(8): allow vm owners to override boot kernel.
vmd allows non-root users to "own" a vm defined in vm.conf(5). While the user can start/stop the vm, if they break their filesystem they have no means of booting recovery media like a ramdisk kernel.
This change opens the provided boot kernel via vmctl and passes the file descriptor through the control channel to vmd. The next boot of the vm will use the provided file descriptor as boot kernel/bios. Subsequent boots (e.g. a reboot) will return to using behavior defined in vm.conf or the default bios image.
ok mlarkin@
|
#
1.87 |
|
27-Apr-2023 |
dv |
vmd(8): introduce multi-process model for virtio devices.
Isolate virtio network and block device emulation in dedicated processes, forked and exec'd from the vm process. This allows for tightening pledge promises to just "stdio".
Communication between the vcpu's and these devices now occurs via imsg channels, which adds the benefit of not always blocking the vcpu thread while emulating the device.
With this commit, it's possible that vmd is the first open source hypervisor that *defaults* to a multi-process device emulation model without requiring any additional configuration from the operator.
Testing help from phessler@ and Mischa Peters.
ok mlarkin@
|
#
1.86 |
|
25-Apr-2023 |
dv |
vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.
The object sent to vmm(4) contained file paths and details the kernel does not need for cpu virtualization as device emulation is in userland. Effectively, "pull up" the struct members from the vm_create_params struct to the parent vmop_create_params struct.
This allows us to clean up some of vmd(8) and simplify things for switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd, etc.) to allow users to boot recovery ramdisk kernels.
ok mlarkin@
|
#
1.85 |
|
23-Apr-2023 |
dv |
vmd(8): teach vmm process how to exec.
Use execvp(2) to launch vm children with new address spaces. Consequently, introduces use of unveil(2) into the vmm and vm processes.
This imposes the requirement of launching vmd with absolute paths, similar to sshd(8).
ok mlarkin@
|
#
1.84 |
|
23-Apr-2023 |
anton |
unbreak tree by coping with recent s/XCR0/XFEATURE rename
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.83 |
|
06-Feb-2023 |
dv |
vmd(8): scan pci bus to determine bootorder strings.
vmd's SeaBIOS bootorder strings had hardcoded pci device ids, so if a user added a network interface the bootorder strings didn't line up with reality. Using vmctl(8) to boot from a cdrom (-B cdrom) would fail, for instance, if attaching both a nic and a disk as well.
This change scans the pci devices and finds the first of each type to construct viable bootorder strings.
ok jan@
|
#
1.82 |
|
28-Jan-2023 |
dv |
Move some header definitions from vmm(4) to vmd(8).
Part of an ongoing effort to move userland-specific information out of a kernel header and directly into vmd(8). No functional change.
ok mlarkin@
|
#
1.81 |
|
08-Jan-2023 |
dv |
vmd(8): add thread names to vm process.
ok guenther@.
|
#
1.80 |
|
04-Jan-2023 |
dv |
Typos in vmd error message. No functional change.
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
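The pfn versus address distinction in this entry comes down to one multiplication. The sketch below assumes the 4096-byte page size used by the legacy virtio-pci interface; the names are hypothetical and not taken from vmd.

```c
#include <stdint.h>

#define EX_VIRTIO_PAGE_SIZE	4096u	/* legacy virtio-pci queue page size */

/*
 * Convert the 32-bit page frame number a driver writes for a virtqueue
 * into the guest physical address where that queue starts. The result
 * is widened to 64 bits so large pfn values do not overflow.
 */
uint64_t
ex_vq_gpa(uint32_t pfn)
{
	return (uint64_t)pfn * EX_VIRTIO_PAGE_SIZE;
}
```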
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
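The off-by-one described here is the classic closed versus half-open interval mistake. A minimal sketch of the corrected check, using a hypothetical `struct ex_mem_range` rather than vmd's own structure:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a guest memory range. */
struct ex_mem_range {
	uint64_t base;	/* first guest physical address in the range */
	uint64_t size;	/* length of the range in bytes */
};

/*
 * Return true if gpa lies in the half-open interval [base, base + size).
 * The address base + size is one byte past the end and must be rejected.
 * Comparing the offset against size also avoids overflow in base + size.
 */
bool
ex_gpa_in_range(const struct ex_mem_range *r, uint64_t gpa)
{
	return gpa >= r->base && gpa - r->base < r->size;
}
```

A check written as `gpa <= base + size` would accept the first byte of whatever follows the range, which matches the symptom described in this entry.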
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
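A local nitems() is typically a one-line macro like the sketch below; the exact spelling used in vmd is not shown in this log, so treat this as the common idiom rather than a quote.

```c
#include <stddef.h>

/* Number of elements in a true array (not a pointer). */
#ifndef nitems
#define nitems(_a)	(sizeof((_a)) / sizeof((_a)[0]))
#endif
```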
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a round about way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequent.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
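The convention this entry enforces can be illustrated with open(2). The wrapper below is a hypothetical helper written for this example, not code from the tree; it shows the comparison against -1 and reads errno only on the failure path.

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Open path read-only. On failure return -1 and store errno in *err;
 * on success return the descriptor and leave *err at 0.
 */
int
ex_open_ro(const char *path, int *err)
{
	int fd;

	*err = 0;
	fd = open(path, O_RDONLY);
	if (fd == -1) {		/* compare against -1, not "< 0" */
		*err = errno;	/* errno is only meaningful on failure */
		return -1;
	}
	return fd;
}
```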
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
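A single state bitfield of the kind described in this entry might look like the sketch below. The flag names, bit values, and helpers are illustrative assumptions; vmd's real definitions may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values, one bit per state. */
#define EX_STATE_RUNNING	0x01
#define EX_STATE_PAUSED		0x02
#define EX_STATE_SHUTDOWN	0x04

struct ex_vm {
	uint32_t state;		/* OR of EX_STATE_* flags */
};

/* Set one or more state flags. */
void
ex_state_set(struct ex_vm *vm, uint32_t flags)
{
	vm->state |= flags;
}

/* Clear one or more state flags. */
void
ex_state_clear(struct ex_vm *vm, uint32_t flags)
{
	vm->state &= ~flags;
}

/* Return true if any of the given flags is set. */
bool
ex_state_test(const struct ex_vm *vm, uint32_t flags)
{
	return (vm->state & flags) != 0;
}
```

With this layout a vm can be both running and paused at once, and reporting code checks one field instead of several separate variables.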
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power on defaults, and bump the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
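The lookup order described above (derived image first, then the base) can be sketched with a toy in-memory model. This is not the qcow2 on-disk format or vmd's implementation; the `ex_image` struct, with one byte standing in for a whole cluster, is purely illustrative.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_NCLUSTERS	4

struct ex_image {
	const struct ex_image	*base;			/* NULL for a standalone image */
	int			 present[EX_NCLUSTERS];	/* nonzero if allocated here */
	uint8_t			 data[EX_NCLUSTERS];	/* one byte per "cluster" */
};

/*
 * Read cluster idx, walking from the derived image down its chain of
 * base images. Clusters allocated nowhere in the chain read as zero.
 * Returns 0 on success, -1 if idx is out of range.
 */
int
ex_read_cluster(const struct ex_image *img, size_t idx, uint8_t *out)
{
	if (idx >= EX_NCLUSTERS)
		return (-1);
	for (; img != NULL; img = img->base) {
		if (img->present[idx]) {
			*out = img->data[idx];
			return (0);
		}
	}
	*out = 0;
	return (0);
}
```

The model also shows why modifying the base is dangerous: a derived image only stores the clusters it has overwritten, so any change underneath silently alters everything it has not.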
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
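A small sketch of the readback described here: when the guest reads port 0x61, bit 5 reflects whether PIT channel 2 has fired. The function and macro names are hypothetical, and how vmd stores the remaining bits of the register is not shown in this log.

```c
#include <stdint.h>

#define EX_PORT61_TMR2_OUT	0x20	/* bit 5: PIT channel 2 output */

/*
 * Compose the value returned for a read of port 0x61: keep the other
 * bits of the register as they are and set or clear bit 5 from the
 * channel 2 state (fired vs. still counting).
 */
uint8_t
ex_port61_read(uint8_t reg, int ch2_fired)
{
	if (ch2_fired)
		return ((uint8_t)(reg | EX_PORT61_TMR2_OUT));
	return ((uint8_t)(reg & (uint8_t)~EX_PORT61_TMR2_OUT));
}
```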
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re-add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
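The idea behind a vaddr_mem()-style lookup can be sketched as below. The `ex_mem_range` struct and `ex_vaddr_mem()` are hypothetical stand-ins, not vmd's actual types or signature; the sketch encodes the stated assumption that a requested span must sit entirely inside one memory range.

```c
#include <stddef.h>
#include <stdint.h>

struct ex_mem_range {
	uint64_t	 gpa;	/* guest physical start address */
	size_t		 size;	/* length of the range in bytes */
	uint8_t		*hva;	/* where the range is mapped in this process */
};

/*
 * Translate a guest physical address into a pointer usable by the
 * host process. Returns NULL if gpa is not backed by any range or if
 * [gpa, gpa + len) would run past the end of the range holding gpa.
 */
void *
ex_vaddr_mem(const struct ex_mem_range *r, size_t nranges, uint64_t gpa,
    size_t len)
{
	uint64_t off;
	size_t i;

	for (i = 0; i < nranges; i++) {
		if (gpa < r[i].gpa)
			continue;
		off = gpa - r[i].gpa;
		if (off >= r[i].size)
			continue;
		if (len > r[i].size - off)
			return (NULL);
		return (r[i].hva + off);
	}
	return (NULL);
}
```

An iovec_mem()-style helper would do the same lookup per segment and store the resulting pointer and length in each `struct iovec`, so readv(2) and writev(2) can operate on guest memory in place.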
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
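The pacing arithmetic can be approximated as follows. This is a back-of-the-envelope sketch, not the calculation in vmd's ns8250 module: it assumes the conventional 1.8432 MHz UART clock (so baud = 115200 / divisor), 10 bit times per character (8N1 framing), and a caller-supplied tick rate.

```c
#include <stdint.h>

/* Baud rate selected by a divisor latch value; 0 is not a valid divisor. */
uint32_t
ex_baud_from_divisor(uint16_t divisor)
{
	if (divisor == 0)
		return (0);
	return (115200 / divisor);
}

/*
 * Number of characters that may be sent per pacing interval before a
 * delay is inserted, given hz intervals per second. Always allows at
 * least one character so output cannot stall completely.
 */
uint32_t
ex_chars_per_tick(uint32_t baud, uint32_t hz)
{
	uint32_t n;

	if (hz == 0)
		return (1);
	n = baud / 10 / hz;
	return (n == 0 ? 1 : n);
}
```

Under these assumptions, a guest programmed for 9600 baud gets roughly 9 characters per 10 ms interval, while 115200 baud allows about 115, which is consistent with the advice above to pick the faster rate.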
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
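A /31 contains exactly two addresses that differ only in the lowest bit, so the host-side and guest-side addresses can be derived from each other. The sketch below works on IPv4 addresses in host byte order; which end vmd assigns to the tap(4) interface and which to the VM is not stated in this log and is left open here.

```c
#include <stdint.h>

#define EX_MASK31	0xfffffffeU	/* netmask for a /31 */

/* Network (lower) address of the /31 containing addr. */
uint32_t
ex_net31(uint32_t addr)
{
	return (addr & EX_MASK31);
}

/* The other address in the same /31: flip the lowest bit. */
uint32_t
ex_peer31(uint32_t addr)
{
	return (addr ^ 1U);
}
```

For example, if one side holds 100.64.0.2 (0x64400002), the peer in the same /31 is 100.64.0.3.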
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
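The first bug is an instance of a general x86 rule: an 8-bit or 16-bit IN only replaces the low bits of %eax, and the rest must be preserved. A minimal sketch of merging emulated port data back into the register (the function name is hypothetical, not vmd's):

```c
#include <stdint.h>

/*
 * Merge data read from an emulated port into the guest's %eax.
 * size is the operand size in bytes: 1 (in %al), 2 (in %ax) or
 * 4 (in %eax). Unknown sizes leave the register untouched.
 */
uint32_t
ex_merge_in(uint32_t eax, uint32_t data, int size)
{
	switch (size) {
	case 1:
		return ((eax & 0xffffff00U) | (data & 0xffU));
	case 2:
		return ((eax & 0xffff0000U) | (data & 0xffffU));
	case 4:
		return (data);
	default:
		return (eax);
	}
}
```

Assigning the byte straight to the whole register would clear bits 8 through 31, which is the kind of error this commit describes.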
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.96 |
|
18-Jan-2024 |
claudio |
Use imsg_get_fd() in vmd.
vmd uses a lot of fd passing and does it sometimes via extra abstraction so this just tries to convert the code without any optimisations.
ok dv@
|
#
1.95 |
|
10-Jan-2024 |
dv |
vmm/vmd: add io instruction length to exit information.
Add the instruction length to the vm exit information to allow vmd(8) to manipulate the instruction pointer after io emulation. This is preparation for emulating string-based io instructions.
Removes the instruction pointer update from the kernel (vmm(4)) as well as the instruction length checks, which were overly restrictive anyways based on the way prefixes work in x86 instructions.
ok mlarkin@
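With the length reported in the exit information, stepping past an emulated io instruction becomes simple arithmetic in userland. The struct and field names below are hypothetical and only illustrate the idea.

```c
#include <stdint.h>

/* Hypothetical subset of the io exit information. */
struct ex_io_exit {
	uint16_t	port;
	uint8_t		insn_len;	/* length of the exiting instruction */
};

/* Instruction pointer to resume at once the io has been emulated. */
uint64_t
ex_next_rip(uint64_t rip, const struct ex_io_exit *exit)
{
	return (rip + exit->insn_len);
}
```

Prefixes are why a fixed length cannot be assumed: `out %al, %dx` encodes as one byte (0xEE), while `rep outsb` is two (0xF3 0x6E).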
|
Revision tags: OPENBSD_7_4_BASE
|
#
1.94 |
|
26-Sep-2023 |
dv |
vmd(8): disambiguate log messages per vm and device.
The logging output from vmd(8) often specifies the function performing the logging, but leaves which vm or vm device to guesswork and reading tea leaves.
Change the logging formatting to prefix with information about the specific vm and potentially the device subprocess. Most of this logging is behind the "verbose" mode, but for warnings this will clarify which vm or device logged the warning.
The format of vm/<name>/<device><index> is chosen to be concise and less ugly than other approaches. This adjusts the process naming for devices to match, dropping the use of brackets.
In the process of this change, updating log settings dynamically via vmctl(8) is fixed by properly broadcasting that information to the device subprocesses. The "vmm" process also now updates its own state properly, so settings survive vm reboots.
ok mlarkin@
|
#
1.93 |
|
26-Sep-2023 |
dv |
vmd(8): fix vm pause deadlock.
When vcpu threads pause, they are holding the run mutex lock. If the event thread is asked to assert an irq on the pic and interrupts are pending, it will try to take the run mutex lock on the vcpu. This deadlocks.
Release the lock in the vcpu thread before waiting on the pause condition variable.
ok mlarkin@
|
#
1.92 |
|
23-Sep-2023 |
dv |
vmd(8): log vmd's vm id, not vmm's in vcpu_run_loop.
Some guests cause a warning message during a shutdown. Log the vmd vm id and not the kernel vmm id as it's next to useless to the end user. This has annoyed me too much.
|
#
1.91 |
|
06-Sep-2023 |
dv |
vmm(4)/vmd(8): include pending interrupt in vm_run_params.
To remove an ioctl(2) from the vcpu thread hotpath in vmd(8), add a flag in the vm_run_params structure to indicate if there's another interrupt pending. This reduces latency in vcpu work related to i/o as we save a trip into the kernel just to flip the interrupt pending flag on or off.
Tested by phessler@, mbuhl@, stsp@, and Mischa Peters.
ok mlarkin@
|
#
1.90 |
|
13-Jul-2023 |
dv |
vmd(8): pull validation into local prefix parser.
Validation for local prefixes, both inet and inet6, was scattered around. To make it even more confusing, vmd was using generic address parsing logic from prior network daemons. vmd doesn't need to parse addresses other than when parsing the local prefix settings in vm.conf and no runtime parsing is needed.
This change merges parsing and validation based on vmd's specific needs for local prefixes (e.g. reserving enough bits for vm id and network interface id encoding in an ipv4 address). In addition, it simplifies the struct from a generic address struct to one focused on just storing the v4 and v6 prefixes and masks. This cleans up an unused TAILQ struct member that isn't used by vmd and was leftover copy-pasta from those prior daemons.
The address parsing that vmd uses is also updated to using the latest logic in bgpd(8).
ok mlarkin@
|
#
1.89 |
|
13-May-2023 |
dv |
vmm(4)/vmd(8): switch to anonymous shared mappings.
While splitting out emulated virtio network and block devices into separate processes, I originally used named mappings via shm_mkstemp(3). While this functionally achieved the desired result, it had two unintended consequences:
1) tearing down a vm process and its child processes required excessive locking as the guest memory was tied into the VFS layer.
2) it was observed by mlarkin@ that actions in other parts of the VFS layer could cause some of the guest memory to flush to storage, possibly filling /tmp.
This commit adds a new vmm(4) ioctl dedicated to allowing a process to request that the kernel share a mapping of guest memory into its own vm space. This requires an open fd to /dev/vmm (requiring root) and both the "vmm" and "proc" pledge(2) promises. In addition, the caller must know enough about the original memory ranges to reconstruct them to make the vm's ranges.
Tested with help from Mischa Peters.
ok mlarkin@
|
#
1.88 |
|
28-Apr-2023 |
dv |
vmd(8)/vmctl(8): allow vm owners to override boot kernel.
vmd allows non-root users to "own" a vm defined in vm.conf(5). While the user can start/stop the vm, if they break their filesystem they have no means of booting recovery media like a ramdisk kernel.
This change opens the provided boot kernel via vmctl and passes the file descriptor through the control channel to vmd. The next boot of the vm will use the provided file descriptor as boot kernel/bios. Subsequent boots (e.g. a reboot) will return to using behavior defined in vm.conf or the default bios image.
ok mlarkin@
|
#
1.87 |
|
27-Apr-2023 |
dv |
vmd(8): introduce multi-process model for virtio devices.
Isolate virtio network and block device emulation in dedicated processes, forked and exec'd from the vm process. This allows for tightening pledge promises to just "stdio".
Communication between the vcpu's and these devices now occurs via imsg channels, which adds the benefit of not always blocking the vcpu thread while emulating the device.
With this commit, it's possible that vmd is the first open source hypervisor that *defaults* to a multi-process device emulation model without requiring any additional configuration from the operator.
Testing help from phessler@ and Mischa Peters.
ok mlarkin@
|
#
1.86 |
|
25-Apr-2023 |
dv |
vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.
The object sent to vmm(4) contained file paths and details the kernel does not need for cpu virtualization as device emulation is in userland. Effectively, "pull up" the struct members from the vm_create_params struct to the parent vmop_create_params struct.
This allows us to clean up some of vmd(8) and simplify things for switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd, etc.) to allow users to boot recovery ramdisk kernels.
ok mlarkin@
|
#
1.85 |
|
23-Apr-2023 |
dv |
vmd(8): teach vmm process how to exec.
Use execvp(2) to launch vm children with new address spaces. Consequently, introduces use of unveil(2) into the vmm and vm processes.
This imposes the requirement of launching vmd with absolute paths, similar to sshd(8).
ok mlarkin@
|
#
1.84 |
|
23-Apr-2023 |
anton |
unbreak tree by coping with recent s/XCR0/XFEATURE rename
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.83 |
|
06-Feb-2023 |
dv |
vmd(8): scan pci bus to determine bootorder strings.
vmd's SeaBIOS bootorder strings had hardcoded pci device ids, so if a user added a network interface the bootorder strings didn't line up with reality. Using vmctl(8) to boot from a cdrom (-B cdrom) would fail, for instance, if attaching both a nic and a disk as well.
This change scans the pci devices and finds the first of each type to construct viable bootorder strings.
ok jan@
|
#
1.82 |
|
28-Jan-2023 |
dv |
Move some header definitions from vmm(4) to vmd(8).
Part of an ongoing effort to move userland-specific information out of a kernel header and directly into vmd(8). No functional change.
ok mlarkin@
|
#
1.81 |
|
08-Jan-2023 |
dv |
vmd(8): add thread names to vm process.
ok guenther@.
|
#
1.80 |
|
04-Jan-2023 |
dv |
Typos in vmd error message. No functional change.
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
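The distinction between a queue pfn and an address comes down to one shift. In the legacy virtio-pci interface the driver writes a 32-bit page frame number using 4096-byte pages, so the guest physical address is recovered as sketched here (the names are illustrative):

```c
#include <stdint.h>

#define EX_VIRTIO_PAGE_SHIFT	12	/* legacy virtio-pci uses 4 KiB pages */

/* Guest physical address of a virtqueue, given the pfn the driver wrote. */
uint64_t
ex_vq_gpa(uint32_t pfn)
{
	/* widen before shifting so high pfns do not overflow 32 bits */
	return ((uint64_t)pfn << EX_VIRTIO_PAGE_SHIFT);
}
```

The resulting address would then go through a guest-to-host lookup so the device code can operate on the ring in place instead of copying it.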
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
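The off-by-one described here is the difference between treating a range as closed or half-open. A sketch of the half-open test, with a hypothetical helper name:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if gpa lies in [start, start + size). The first byte past the
 * end of the range is not part of it. Written as a subtraction so the
 * check cannot overflow when start + size would wrap.
 */
bool
ex_gpa_in_range(uint64_t gpa, uint64_t start, uint64_t size)
{
	return (gpa >= start && gpa - start < size);
}
```

The buggy form would be equivalent to `gpa - start <= size`, which also accepts the first byte of whatever range comes next.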
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
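A sketch of what "bytes everywhere, but still whole megabytes" can look like in a validation helper. The names are hypothetical and the real checks in vmd(8) and vmctl(8) may differ.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_MIB	(1024ULL * 1024ULL)

/* Convert a size given in megabytes to bytes. */
uint64_t
ex_mib_to_bytes(uint64_t mib)
{
	return (mib * EX_MIB);
}

/* Accept only non-zero sizes that are a whole number of megabytes. */
bool
ex_memsize_ok(uint64_t bytes)
{
	return (bytes != 0 && bytes % EX_MIB == 0);
}
```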
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a round about way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
The previous implementation sometimes hit a deadlock as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequent.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
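As a minimal sketch of the convention this entry describes (not code from the vmd source), the hypothetical helper below compares the return value of open(2) against -1 exactly and only reads errno in that case:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a file read-only.
 * Returns the descriptor on success. On failure returns -1 and stores
 * the error code in *saved_errno; on success *saved_errno is 0.
 */
int
ex_open_readonly(const char *path, int *saved_errno)
{
	int fd;

	*saved_errno = 0;
	fd = open(path, O_RDONLY);
	if (fd == -1) {			/* exactly -1, not "< 0" */
		*saved_errno = errno;	/* errno is only meaningful here */
		return (-1);
	}
	return (fd);
}
```

The caller owns the returned descriptor and is expected to close(2) it.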
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These were never unset on AMD/SVM guests booted via vmctl start -b, causing them to run very slowly.
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
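The set-then-verify pattern from this commit might look roughly like the sketch below. The struct layout, the signature string, and the function names are all placeholders for illustration; the real vm_dump_header and VMM_HV_SIGNATURE definitions are not reproduced here.

```c
#include <stdint.h>
#include <string.h>

#define EX_SIGNATURE	"EXAMPLE-HV-SIG"	/* placeholder value */

struct ex_dump_header {
	char		sig[16];	/* must hold EX_SIGNATURE plus NUL */
	uint32_t	version;
};

/* Fill in a header before writing a vm image. */
void
ex_header_init(struct ex_dump_header *h, uint32_t version)
{
	memset(h, 0, sizeof(*h));
	memcpy(h->sig, EX_SIGNATURE, sizeof(EX_SIGNATURE));
	h->version = version;
}

/* Return 1 if the header carries the expected signature, else 0. */
int
ex_header_valid(const struct ex_dump_header *h)
{
	return (memcmp(h->sig, EX_SIGNATURE, sizeof(EX_SIGNATURE)) == 0);
}
```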
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
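As a rough illustration of tracking state in one bitfield, here is a small sketch. The flag names, values, and struct are hypothetical and do not mirror the definitions in vmd.h.

```c
#include <stdint.h>

/* Hypothetical state flags; one bit per state. */
#define EX_VM_STATE_RUNNING	0x01
#define EX_VM_STATE_PAUSED	0x02
#define EX_VM_STATE_SHUTDOWN	0x04

struct ex_vm {
	uint32_t	state;	/* bitwise OR of EX_VM_STATE_* */
};

void
ex_vm_pause(struct ex_vm *vm)
{
	vm->state |= EX_VM_STATE_PAUSED;
}

void
ex_vm_unpause(struct ex_vm *vm)
{
	vm->state &= ~(uint32_t)EX_VM_STATE_PAUSED;
}

int
ex_vm_is_paused(const struct ex_vm *vm)
{
	return ((vm->state & EX_VM_STATE_PAUSED) != 0);
}
```

With a single field, reporting code can test any combination of states with one mask instead of consulting several variables.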
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect the same.
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
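The lookup order described above (derived image first, then the base) can be sketched as a walk along a chain of images. This is a simplified model, not the qcow2 code in vmd: clusters are modelled as single integers, all names are hypothetical, and a cluster missing from every image is assumed to read as zero.

```c
#include <stddef.h>

#define EX_UNALLOCATED	(-1)

struct ex_image {
	const int		*clusters;	/* value per cluster, or EX_UNALLOCATED */
	size_t			 nclusters;
	const struct ex_image	*base;		/* NULL for a base image */
};

/*
 * Look up cluster idx, starting in the given image and falling
 * back to each base image in turn.
 */
int
ex_read_cluster(const struct ex_image *img, size_t idx)
{
	for (; img != NULL; img = img->base) {
		if (idx < img->nclusters &&
		    img->clusters[idx] != EX_UNALLOCATED)
			return (img->clusters[idx]);
	}
	return (0);	/* assumption: unallocated everywhere reads as zero */
}
```

The model also shows why modifying the base is dangerous: any cluster the derived image has not overridden is read straight from the base.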
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
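A minimal sketch of the readback, assuming the conventional PC layout in which bit 5 of port 0x61 reflects the PIT channel 2 output. The function and macro names are hypothetical, not those used in vmd's i8253 emulation.

```c
#include <stdint.h>

#define EX_PORT61_TMR2_OUT	0x20	/* bit 5: channel 2 output status */

/*
 * Compose the byte returned for a read of port 0x61: keep the
 * other latched bits and set or clear bit 5 depending on whether
 * channel 2 has fired (1) or is still counting (0).
 */
uint8_t
ex_port61_read(uint8_t latched, int ch2_fired)
{
	if (ch2_fired)
		return ((uint8_t)(latched | EX_PORT61_TMR2_OUT));
	return ((uint8_t)(latched & ~EX_PORT61_TMR2_OUT));
}
```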
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re-add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
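The idea behind vaddr_mem() can be sketched as a search over memory ranges that maps a guest physical address to a host pointer. This is an illustration under assumptions: the struct, its field names, and the function signature below are hypothetical and do not match the actual vmd definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one guest memory range. */
struct ex_mem_range {
	uint64_t	 gpa;	/* guest physical start address */
	size_t		 size;	/* length in bytes */
	uint8_t		*hva;	/* where the range is mapped in this process */
};

/*
 * Return a pointer usable by the host process for [gpa, gpa + len),
 * or NULL if the span is not fully contained in a single range.
 */
void *
ex_vaddr_mem(const struct ex_mem_range *r, size_t nranges, uint64_t gpa,
    size_t len)
{
	size_t i;
	uint64_t off;

	for (i = 0; i < nranges; i++) {
		if (gpa < r[i].gpa)
			continue;
		off = gpa - r[i].gpa;
		if (off >= r[i].size)
			continue;
		if (len > r[i].size - off)
			return (NULL);	/* would cross the end of the range */
		return (r[i].hva + off);
	}
	return (NULL);
}
```

The NULL return for a span that crosses a range boundary reflects the single-range assumption called out in the commit message.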
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
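The pacing arithmetic can be sketched as follows. This is not the ns8250 code from vmd: it assumes 10 bits on the wire per character (8N1 framing), and the function names and the burst parameter are hypothetical.

```c
#include <stdint.h>

/* Characters per second at a given baud rate, assuming 8N1 framing. */
uint32_t
ex_chars_per_sec(uint32_t baud)
{
	return (baud / 10);
}

/*
 * Microseconds to pause after sending 'burst' characters so that
 * the average output rate does not exceed the programmed baud rate.
 */
uint64_t
ex_pause_usec(uint32_t baud, uint32_t burst)
{
	uint32_t cps = ex_chars_per_sec(baud);

	if (cps == 0)
		return (0);
	return ((uint64_t)burst * 1000000ULL / cps);
}
```

Under these assumptions a guest set to 9600 baud moves about 960 characters per second, versus about 11520 at 115200, which matches the advice above to raise the guest's baud rate.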
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
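A /31 subnet holds exactly two addresses that differ only in the lowest bit, so each side's peer can be computed with a single XOR. The sketch below works on host-byte-order IPv4 values; the function names are hypothetical, and which of the two addresses vmd assigns to the host versus the VM is not stated in the log, so the code does not assume either.

```c
#include <stdint.h>

/* Network address of the /31 containing addr (host byte order). */
uint32_t
ex_slash31_network(uint32_t addr)
{
	return (addr & ~(uint32_t)1);
}

/* The other address in the same /31 (host byte order). */
uint32_t
ex_slash31_peer(uint32_t addr)
{
	return (addr ^ 1);
}
```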
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
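The first bug is easy to picture: a one-byte IN should replace only the low byte of %eax and leave the upper bits alone. A minimal sketch follows, with a hypothetical function name and with the access size passed in bytes.

```c
#include <stdint.h>

/*
 * Merge 'data' read from an emulated port into the guest's %eax,
 * touching only the low 'size' bytes (1, 2 or 4).
 */
uint32_t
ex_merge_in_data(uint32_t eax, uint32_t data, unsigned int size)
{
	uint32_t mask;

	switch (size) {
	case 1:
		mask = 0xff;
		break;
	case 2:
		mask = 0xffff;
		break;
	default:
		mask = 0xffffffff;
		break;
	}
	return ((eax & ~mask) | (data & mask));
}
```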
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.95 |
|
10-Jan-2024 |
dv |
vmm/vmd: add io instruction length to exit information.
Add the instruction length to the vm exit information to allow vmd(8) to manipulate the instruction pointer after io emulation. This is preparation for emulating string-based io instructions.
Removes the instruction pointer update from the kernel (vmm(4)) as well as the instruction length checks, which were overly restrictive anyways based on the way prefixes work in x86 instructions.
ok mlarkin@
|
Revision tags: OPENBSD_7_4_BASE
|
#
1.94 |
|
26-Sep-2023 |
dv |
vmd(8): disambiguate log messages per vm and device.
The logging output from vmd(8) often specifies the function performing the logging, but leaves which vm or vm device to guesswork and reading tea leaves.
Change the logging formatting to prefix with information about the specific vm and potentially the device subprocess. Most of this logging is behind the "verbose" mode, but for warnings this will clarify which vm or device logged the warning.
The format of vm/<name>/<device><index> is chosen to be concise and less ugly than other approaches. This adjusts the process naming for devices to match, dropping the use of brackets.
In the process of this change, updating log settings dynamically via vmctl(8) is fixed by properly broadcasting that information to the device subprocesses. The "vmm" process also now updates its own state properly, so settings survive vm reboots.
ok mlarkin@
|
#
1.93 |
|
26-Sep-2023 |
dv |
vmd(8): fix vm pause deadlock.
When vcpu threads pause, they are holding the run mutex lock. If the event thread is asked to assert an irq on the pic and interrupts are pending, it will try to take the run mutex lock on the vcpu. This deadlocks.
Release the lock in the vcpu thread before waiting on the pause condition variable.
ok mlarkin@
|
#
1.92 |
|
23-Sep-2023 |
dv |
vmd(8): log vmd's vm id, not vmm's in vcpu_run_loop.
Some guests cause a warning message during a shutdown. Log the vmd vm id and not the kernel vmm id as it's next to useless to the end user. This has annoyed me too much.
|
#
1.91 |
|
06-Sep-2023 |
dv |
vmm(4)/vmd(8): include pending interrupt in vm_run_params.
To remove an ioctl(2) from the vcpu thread hotpath in vmd(8), add a flag in the vm_run_params structure to indicate if there's another interrupt pending. This reduces latency in vcpu work related to i/o as we save a trip into the kernel just to flip the interrupt pending flag on or off.
Tested by phessler@, mbuhl@, stsp@, and Mischa Peters.
ok mlarkin@
|
#
1.90 |
|
13-Jul-2023 |
dv |
vmd(8): pull validation into local prefix parser.
Validation for local prefixes, both inet and inet6, was scattered around. To make it even more confusing, vmd was using generic address parsing logic from prior network daemons. vmd doesn't need to parse addresses other than when parsing the local prefix settings in vm.conf and no runtime parsing is needed.
This change merges parsing and validation based on vmd's specific needs for local prefixes (e.g. reserving enough bits for vm id and network interface id encoding in an ipv4 address). In addition, it simplifies the struct from a generic address struct to one focused on just storing the v4 and v6 prefixes and masks. This removes a TAILQ struct member that isn't used by vmd and was leftover copy-pasta from those prior daemons.
The address parsing that vmd uses is also updated to using the latest logic in bgpd(8).
ok mlarkin@
|
#
1.89 |
|
13-May-2023 |
dv |
vmm(4)/vmd(8): switch to anonymous shared mappings.
While splitting out emulated virtio network and block devices into separate processes, I originally used named mappings via shm_mkstemp(3). While this functionally achieved the desired result, it had two unintended consequences:
1) tearing down a vm process and its child processes required excessive locking as the guest memory was tied into the VFS layer.
2) it was observed by mlarkin@ that actions in other parts of the VFS layer could cause some of the guest memory to flush to storage, possibly filling /tmp.
This commit adds a new vmm(4) ioctl dedicated to allowing a process to request that the kernel share a mapping of guest memory into its own vm space. This requires an open fd to /dev/vmm (requiring root) and both the "vmm" and "proc" pledge(2) promises. In addition, the caller must know enough about the original memory ranges to reconstruct them to make the vm's ranges.
Tested with help from Mischa Peters.
ok mlarkin@
|
#
1.88 |
|
28-Apr-2023 |
dv |
vmd(8)/vmctl(8): allow vm owners to override boot kernel.
vmd allows non-root users to "own" a vm defined in vm.conf(5). While the user can start/stop the vm, if they break their filesystem they have no means of booting recovery media like a ramdisk kernel.
This change opens the provided boot kernel via vmctl and passes the file descriptor through the control channel to vmd. The next boot of the vm will use the provided file descriptor as boot kernel/bios. Subsequent boots (e.g. a reboot) will return to using behavior defined in vm.conf or the default bios image.
ok mlarkin@
|
#
1.87 |
|
27-Apr-2023 |
dv |
vmd(8): introduce multi-process model for virtio devices.
Isolate virtio network and block device emulation in dedicated processes, forked and exec'd from the vm process. This allows for tightening pledge promises to just "stdio".
Communication between the vcpu's and these devices now occurs via imsg channels, which adds the benefit of not always blocking the vcpu thread while emulating the device.
With this commit, it's possible that vmd is the first open source hypervisor that *defaults* to a multi-process device emulation model without requiring any additional configuration from the operator.
Testing help from phessler@ and Mischa Peters.
ok mlarkin@
|
#
1.86 |
|
25-Apr-2023 |
dv |
vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.
The object sent to vmm(4) contained file paths and details the kernel does not need for cpu virtualization as device emulation is in userland. Effectively, "pull up" the struct members from the vm_create_params struct to the parent vmop_create_params struct.
This allows us to clean up some of vmd(8) and simplify things for switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd, etc.) to allow users to boot recovery ramdisk kernels.
ok mlarkin@
|
#
1.85 |
|
23-Apr-2023 |
dv |
vmd(8): teach vmm process how to exec.
Use execvp(2) to launch vm children with new address spaces. Consequently, introduces use of unveil(2) into the vmm and vm processes.
This imposes the requirement of launching vmd with absolute paths, similar to sshd(8).
ok mlarkin@
|
#
1.84 |
|
23-Apr-2023 |
anton |
unbreak tree by coping with recent s/XCR0/XFEATURE rename
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.83 |
|
06-Feb-2023 |
dv |
vmd(8): scan pci bus to determine bootorder strings.
vmd's SeaBIOS bootorder strings had hardcoded pci device ids, so if a user added a network interface the bootorder strings didn't line up with reality. Using vmctl(8) to boot from a cdrom (-B cdrom) would fail, for instance, if attaching both a nic and a disk as well.
This change scans the pci devices and finds the first of each type to construct viable bootorder strings.
ok jan@
|
#
1.82 |
|
28-Jan-2023 |
dv |
Move some header definitions from vmm(4) to vmd(8).
Part of an ongoing effort to move userland-specific information out of a kernel header and directly into vmd(8). No functional change.
ok mlarkin@
|
#
1.81 |
|
08-Jan-2023 |
dv |
vmd(8): add thread names to vm process.
ok guenther@.
|
#
1.80 |
|
04-Jan-2023 |
dv |
Typos in vmd error message. No functional change.
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
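The pfn-versus-address distinction amounts to one multiplication. The sketch below assumes the legacy virtio-pci convention of 4096-byte pages; the function and macro names are hypothetical.

```c
#include <stdint.h>

#define EX_VIRTIO_PAGE_SIZE	4096ULL

/*
 * Convert the 32-bit queue pfn written by a driver into the guest
 * physical address of the virtqueue.  The widening cast matters:
 * the result may not fit in 32 bits.
 */
uint64_t
ex_queue_gpa(uint32_t pfn)
{
	return ((uint64_t)pfn * EX_VIRTIO_PAGE_SIZE);
}
```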
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
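The off-by-one is the classic inclusive-versus-exclusive end check. A minimal sketch of the corrected test, using a hypothetical function name and a half-open range [start, start + size):

```c
#include <stdint.h>

/*
 * Return 1 if gpa lies inside [start, start + size), else 0.
 * Comparing the offset against size avoids both the off-by-one
 * (gpa == start + size is outside) and overflow in start + size.
 */
int
ex_gpa_in_range(uint64_t gpa, uint64_t start, uint64_t size)
{
	return (gpa >= start && gpa - start < size);
}
```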
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
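Placing a copy so that it ends exactly at the 4g boundary is a subtraction from 4 GiB. The sketch below is illustrative only: the function name is hypothetical and the BIOS size used in any call is an example value, not the size of the actual vmm-firmware image.

```c
#include <stdint.h>

#define EX_4G	0x100000000ULL

/*
 * Guest physical address at which a BIOS image of 'size' bytes
 * must start so that it ends exactly at the 4 GiB boundary.
 * Returns 0 if the image is empty or too large to fit below 4 GiB.
 */
uint64_t
ex_bios_high_base(uint64_t size)
{
	if (size == 0 || size >= EX_4G)
		return (0);
	return (EX_4G - size);
}
```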
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
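Keeping sizes in bytes while still requiring whole megabytes can be sketched as below. The names are hypothetical, and "megabyte" is assumed here to mean 1024 * 1024 bytes.

```c
#include <stdint.h>

#define EX_MIB	(1024ULL * 1024ULL)

/* Convert a size given in megabytes to bytes. */
uint64_t
ex_mib_to_bytes(uint64_t mib)
{
	return (mib * EX_MIB);
}

/* Return 1 if a byte count is a non-zero whole number of megabytes. */
int
ex_is_whole_mib(uint64_t bytes)
{
	return (bytes != 0 && bytes % EX_MIB == 0);
}
```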
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
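A minimal sketch of the convention described above, using open(2) as the example call. The helper name try_open() is hypothetical and is not taken from vmd(8); it only illustrates comparing against -1 exactly and reading errno only in that case.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 0 on success, or the errno value
 * reported by open(2) on failure.
 */
int
try_open(const char *path)
{
	int fd;

	/* Compare against -1 exactly, not "< 0". */
	if ((fd = open(path, O_RDONLY)) == -1)
		return (errno);	/* errno is only meaningful here */

	close(fd);
	return (0);
}
```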
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b, causing them to run very slowly.
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect the same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re-add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
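As a rough illustration of the iovec_mem() idea, the sketch below translates a span of guest physical addresses into iovec entries that point directly at host memory. The struct sketch_range layout, the function name sketch_iovec_mem(), and the assumption that ranges are sorted by ascending guest address are all hypothetical; vmd's real data structures and signatures may differ.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Hypothetical description of one guest memory range. */
struct sketch_range {
	uint64_t	 gpa;	/* guest physical start address */
	size_t		 size;	/* length of the range in bytes */
	void		*host;	/* where the range is mapped in this process */
};

/*
 * Fill iov[] so it covers [gpa, gpa + len) without copying.
 * Ranges are assumed sorted by ascending gpa.
 * Returns the number of iovecs used, or -1 if the span is not
 * fully backed or iov[] is too small.
 */
int
sketch_iovec_mem(const struct sketch_range *r, size_t nranges,
    uint64_t gpa, size_t len, struct iovec *iov, int iovcnt)
{
	size_t i, off, chunk;
	int n = 0;

	for (i = 0; i < nranges && len > 0; i++) {
		/* Skip ranges that do not contain the current address. */
		if (gpa < r[i].gpa || gpa - r[i].gpa >= r[i].size)
			continue;
		if (n == iovcnt)
			return (-1);

		off = gpa - r[i].gpa;
		chunk = r[i].size - off;
		if (chunk > len)
			chunk = len;

		iov[n].iov_base = (char *)r[i].host + off;
		iov[n].iov_len = chunk;
		n++;

		gpa += chunk;
		len -= chunk;
	}
	return (len == 0 ? n : -1);
}
```

The resulting array could then be handed to readv(2) or writev(2), which is the use the commit message describes.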
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.94 |
|
26-Sep-2023 |
dv |
vmd(8): disambiguate log messages per vm and device.
The logging output from vmd(8) often specifies the function performing the logging, but leaves which vm or vm device to guesswork and reading tea leaves.
Change the logging formatting to prefix with information about the specific vm and potentially the device subprocess. Most of this logging is behind the "verbose" mode, but for warnings this will clarify which vm or device logged the warning.
The format of vm/<name>/<device><index> is chosen to be concise and less ugly than other approaches. This adjusts the process naming for devices to match, dropping the use of brackets.
In the process of this change, updating log settings dynamically via vmctl(8) is fixed by properly broadcasting that information to the device subprocesses. The "vmm" process also now updates its own state properly, so settings survive vm reboots.
ok mlarkin@
|
#
1.93 |
|
26-Sep-2023 |
dv |
vmd(8): fix vm pause deadlock.
When vcpu threads pause, they are holding the run mutex lock. If the event thread is asked to assert an irq on the pic and interrupts are pending, it will try to take the run mutex lock on the vcpu. This deadlocks.
Release the lock in the vcpu thread before waiting on the pause condition variable.
ok mlarkin@
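The locking order described above can be sketched with plain pthreads. All names here (run_mtx, pause_mtx, sketch_pause_demo() and so on) are hypothetical and the code is a simplified model, not vmd's vcpu loop: the vcpu thread drops the run lock before it sleeps on the pause condition, so another thread can still take the run lock while the vcpu is paused.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t run_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t pause_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t unpause_cond = PTHREAD_COND_INITIALIZER;
static pthread_cond_t paused_cond = PTHREAD_COND_INITIALIZER;
static int pause_requested = 1;
static int is_paused = 0;

static void *
sketch_vcpu_thread(void *arg)
{
	(void)arg;

	pthread_mutex_lock(&run_mtx);
	/* ... guest would run here while holding the run lock ... */

	/* Release the run lock *before* sleeping on the pause condition. */
	pthread_mutex_unlock(&run_mtx);

	pthread_mutex_lock(&pause_mtx);
	is_paused = 1;
	pthread_cond_signal(&paused_cond);
	while (pause_requested)
		pthread_cond_wait(&unpause_cond, &pause_mtx);
	is_paused = 0;
	pthread_mutex_unlock(&pause_mtx);

	/* Retake the run lock to resume. */
	pthread_mutex_lock(&run_mtx);
	pthread_mutex_unlock(&run_mtx);
	return (NULL);
}

/*
 * Plays the role of the event thread. Returns 1 if the run lock
 * could be taken while the vcpu thread was paused, 0 otherwise.
 */
int
sketch_pause_demo(void)
{
	pthread_t t;
	int got_lock;

	if (pthread_create(&t, NULL, sketch_vcpu_thread, NULL) != 0)
		return (0);

	/* Wait until the vcpu thread reports that it is paused. */
	pthread_mutex_lock(&pause_mtx);
	while (!is_paused)
		pthread_cond_wait(&paused_cond, &pause_mtx);
	pthread_mutex_unlock(&pause_mtx);

	/* Asserting an irq would need the run lock at this point. */
	got_lock = (pthread_mutex_trylock(&run_mtx) == 0);
	if (got_lock)
		pthread_mutex_unlock(&run_mtx);

	/* Unpause and clean up. */
	pthread_mutex_lock(&pause_mtx);
	pause_requested = 0;
	pthread_cond_broadcast(&unpause_cond);
	pthread_mutex_unlock(&pause_mtx);
	pthread_join(t, NULL);
	return (got_lock);
}
```

If the vcpu thread kept run_mtx held across the wait, the trylock in the demo would fail, which models the deadlock the commit fixes.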
|
#
1.92 |
|
23-Sep-2023 |
dv |
vmd(8): log vmd's vm id, not vmm's in vcpu_run_loop.
Some guests cause a warning message during a shutdown. Log the vmd vm id and not the kernel vmm id as it's next to useless to the end user. This has annoyed me too much.
|
#
1.91 |
|
06-Sep-2023 |
dv |
vmm(4)/vmd(8): include pending interrupt in vm_run_params.
To remove an ioctl(2) from the vcpu thread hotpath in vmd(8), add a flag in the vm_run_params structure to indicate if there's another interrupt pending. This reduces latency in vcpu work related to i/o as we save a trip into the kernel just to flip the interrupt pending flag on or off.
Tested by phessler@, mbuhl@, stsp@, and Mischa Peters.
ok mlarkin@
|
#
1.90 |
|
13-Jul-2023 |
dv |
vmd(8): pull validation into local prefix parser.
Validation for local prefixes, both inet and inet6, was scattered around. To make it even more confusing, vmd was using generic address parsing logic from prior network daemons. vmd doesn't need to parse addresses other than when parsing the local prefix settings in vm.conf and no runtime parsing is needed.
This change merges parsing and validation based on vmd's specific needs for local prefixes (e.g. reserving enough bits for vm id and network interface id encoding in an ipv4 address). In addition, it simplifies the struct from a generic address struct to one focused on just storing the v4 and v6 prefixes and masks. This removes a TAILQ struct member that isn't used by vmd and was leftover copy-pasta from those prior daemons.
The address parsing that vmd uses is also updated to using the latest logic in bgpd(8).
ok mlarkin@
|
#
1.89 |
|
13-May-2023 |
dv |
vmm(4)/vmd(8): switch to anonymous shared mappings.
While splitting out emulated virtio network and block devices into separate processes, I originally used named mappings via shm_mkstemp(3). While this functionally achieved the desired result, it had two unintended consequences:
1) tearing down a vm process and its child processes required excessive locking as the guest memory was tied into the VFS layer.
2) it was observed by mlarkin@ that actions in other parts of the VFS layer could cause some of the guest memory to flush to storage, possibly filling /tmp.
This commit adds a new vmm(4) ioctl dedicated to allowing a process to request that the kernel share a mapping of guest memory into its own vm space. This requires an open fd to /dev/vmm (requiring root) and both the "vmm" and "proc" pledge(2) promises. In addition, the caller must know enough about the original memory ranges to reconstruct them to make the vm's ranges.
Tested with help from Mischa Peters.
ok mlarkin@
|
#
1.88 |
|
28-Apr-2023 |
dv |
vmd(8)/vmctl(8): allow vm owners to override boot kernel.
vmd allows non-root users to "own" a vm defined in vm.conf(5). While the user can start/stop the vm, if they break their filesystem they have no means of booting recovery media like a ramdisk kernel.
This change opens the provided boot kernel via vmctl and passes the file descriptor through the control channel to vmd. The next boot of the vm will use the provided file descriptor as boot kernel/bios. Subsequent boots (e.g. a reboot) will return to using behavior defined in vm.conf or the default bios image.
ok mlarkin@
|
#
1.87 |
|
27-Apr-2023 |
dv |
vmd(8): introduce multi-process model for virtio devices.
Isolate virtio network and block device emulation in dedicated processes, forked and exec'd from the vm process. This allows for tightening pledge promises to just "stdio".
Communication between the vcpu's and these devices now occurs via imsg channels, which adds the benefit of not always blocking the vcpu thread while emulating the device.
With this commit, it's possible that vmd is the first open source hypervisor that *defaults* to a multi-process device emulation model without requiring any additional configuration from the operator.
Testing help from phessler@ and Mischa Peters.
ok mlarkin@
|
#
1.86 |
|
25-Apr-2023 |
dv |
vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.
The object sent to vmm(4) contained file paths and details the kernel does not need for cpu virtualization as device emulation is in userland. Effectively, "pull up" the struct members from the vm_create_params struct to the parent vmop_create_params struct.
This allows us to clean up some of vmd(8) and simplify things for switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd, etc.) to allow users to boot recovery ramdisk kernels.
ok mlarkin@
|
#
1.85 |
|
23-Apr-2023 |
dv |
vmd(8): teach vmm process how to exec.
Use execvp(2) to launch vm children with new address spaces. Consequently, introduces use of unveil(2) into the vmm and vm processes.
This imposes the requirement of launching vmd with absolute paths, similar to sshd(8).
ok mlarkin@
|
#
1.84 |
|
23-Apr-2023 |
anton |
unbreak tree by coping with recent s/XCR0/XFEATURE rename
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.83 |
|
06-Feb-2023 |
dv |
vmd(8): scan pci bus to determine bootorder strings.
vmd's SeaBIOS bootorder strings had hardcoded pci device ids, so if a user added a network interface the bootorder strings didn't line up with reality. Using vmctl(8) to boot from a cdrom (-B cdrom) would fail, for instance, if attaching both a nic and a disk as well.
This change scans the pci devices and finds the first of each type to construct viable bootorder strings.
ok jan@
|
#
1.82 |
|
28-Jan-2023 |
dv |
Move some header definitions from vmm(4) to vmd(8).
Part of an ongoing effort to move userland-specific information out of a kernel header and directly into vmd(8). No functional change.
ok mlarkin@
|
#
1.81 |
|
08-Jan-2023 |
dv |
vmd(8): add thread names to vm process.
ok guenther@.
|
#
1.80 |
|
04-Jan-2023 |
dv |
Typos in vmd error message. No functional change.
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte. any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
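To make the pfn versus address distinction concrete: in the legacy virtio-pci interface the driver writes a 32-bit page frame number, and the device side multiplies it by the 4096-byte virtio page size to get the guest physical address of the queue. The macro and function names below are hypothetical, not vmd's.

```c
#include <stdint.h>

/* Legacy virtio-pci queue addresses are expressed in 4096-byte pages. */
#define SKETCH_VIRTIO_PAGE_SIZE	4096ULL

/* Convert the 32-bit queue pfn written by the driver to a guest physical address. */
uint64_t
sketch_queue_pfn_to_gpa(uint32_t pfn)
{
	/* Widen first so the result can exceed 32 bits. */
	return ((uint64_t)pfn * SKETCH_VIRTIO_PAGE_SIZE);
}
```

Because the value is a page number, a 32-bit register can still describe a queue located above the 4 GiB boundary.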
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
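The kind of bounds check involved can be sketched as follows. The struct and function names are hypothetical and the real vmd code may be organised differently; the point is that a range covers the half-open interval [gpa, gpa + size), so the address one byte past the end belongs to the next range, not this one.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guest memory range: covers [gpa, gpa + size). */
struct sketch_mem_range {
	uint64_t	gpa;
	uint64_t	size;
};

/* Return the range containing gpa, or NULL if none does. */
const struct sketch_mem_range *
sketch_find_range(const struct sketch_mem_range *ranges, size_t n,
    uint64_t gpa)
{
	size_t i;

	for (i = 0; i < n; i++) {
		/*
		 * Strict "<" on the offset: an off-by-one "<=" here would
		 * accept the first byte after the range.
		 */
		if (gpa >= ranges[i].gpa &&
		    gpa - ranges[i].gpa < ranges[i].size)
			return (&ranges[i]);
	}
	return (NULL);
}
```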
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
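For readers unfamiliar with the short read/write problem mentioned above, here is a minimal sketch of a loop that keeps calling write(2) until the whole buffer has been sent. It is not OpenBSD's atomicio helper; the name sketch_write_all() and the exact error handling are assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write exactly len bytes to fd, retrying on short writes and EINTR.
 * Returns len on success or -1 on error.
 */
ssize_t
sketch_write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;
	size_t done = 0;
	ssize_t n;

	while (done < len) {
		n = write(fd, p + done, len - done);
		if (n == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return (-1);
		}
		if (n == 0) {
			/* No progress: treat as a broken pipe. */
			errno = EPIPE;
			return (-1);
		}
		done += (size_t)n;
	}
	return ((ssize_t)done);
}
```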
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequent.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
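As an illustration of the convention described above, the sketch below compares the return value against -1 exactly and reads errno only in that case. `try_open` is a hypothetical helper written for this example, not a function from vmd.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 0 on success, or the errno value reported by open(2). */
static int
try_open(const char *path)
{
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1)		/* exactly -1, not "fd < 0" */
		return (errno);	/* errno is only meaningful here */
	close(fd);
	return (0);
}
```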
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting a OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re-add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guests physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work).
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where the OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.92 |
|
23-Sep-2023 |
dv |
vmd(8): log vmd's vm id, not vmm's in vcpu_run_loop.
Some guests cause a warning message during a shutdown. Log the vmd vm id and not the kernel vmm id as it's next to useless to the end user. This has annoyed me too much.
|
#
1.91 |
|
06-Sep-2023 |
dv |
vmm(4)/vmd(8): include pending interrupt in vm_run_params.
To remove an ioctl(2) from the vcpu thread hotpath in vmd(8), add a flag in the vm_run_params structure to indicate if there's another interrupt pending. This reduces latency in vcpu work related to i/o as we save a trip into the kernel just to flip the interrupt pending flag on or off.
Tested by phessler@, mbuhl@, stsp@, and Mischa Peters.
ok mlarkin@
|
#
1.90 |
|
13-Jul-2023 |
dv |
vmd(8): pull validation into local prefix parser.
Validation for local prefixes, both inet and inet6, was scattered around. To make it even more confusing, vmd was using generic address parsing logic from prior network daemons. vmd doesn't need to parse addresses other than when parsing the local prefix settings in vm.conf and no runtime parsing is needed.
This change merges parsing and validation based on vmd's specific needs for local prefixes (e.g. reserving enough bits for vm id and network interface id encoding in an ipv4 address). In addition, it simplifies the struct from a generic address struct to one focused on just storing the v4 and v6 prefixes and masks. This cleans up an unused TAILQ struct member that isn't used by vmd and was leftover copy-pasta from those prior daemons.
The address parsing that vmd uses is also updated to using the latest logic in bgpd(8).
ok mlarkin@
|
#
1.89 |
|
13-May-2023 |
dv |
vmm(4)/vmd(8): switch to anonymous shared mappings.
While splitting out emulated virtio network and block devices into separate processes, I originally used named mappings via shm_mkstemp(3). While this functionally achieved the desired result, it had two unintended consequences:
1) tearing down a vm process and its child processes required excessive locking as the guest memory was tied into the VFS layer.
2) it was observed by mlarkin@ that actions in other parts of the VFS layer could cause some of the guest memory to flush to storage, possibly filling /tmp.
This commit adds a new vmm(4) ioctl dedicated to allowing a process to request that the kernel share a mapping of guest memory into its own vm space. This requires an open fd to /dev/vmm (requiring root) and both the "vmm" and "proc" pledge(2) promises. In addition, the caller must know enough about the original memory ranges to reconstruct them to make the vm's ranges.
Tested with help from Mischa Peters.
ok mlarkin@
|
#
1.88 |
|
28-Apr-2023 |
dv |
vmd(8)/vmctl(8): allow vm owners to override boot kernel.
vmd allows non-root users to "own" a vm defined in vm.conf(5). While the user can start/stop the vm, if they break their filesystem they have no means of booting recovery media like a ramdisk kernel.
This change opens the provided boot kernel via vmctl and passes the file descriptor through the control channel to vmd. The next boot of the vm will use the provided file descriptor as boot kernel/bios. Subsequent boots (e.g. a reboot) will return to using behavior defined in vm.conf or the default bios image.
ok mlarkin@
|
#
1.87 |
|
27-Apr-2023 |
dv |
vmd(8): introduce multi-process model for virtio devices.
Isolate virtio network and block device emulation in dedicated processes, forked and exec'd from the vm process. This allows for tightening pledge promises to just "stdio".
Communication between the vcpu's and these devices now occurs via imsg channels, which adds the benefit of not always blocking the vcpu thread while emulating the device.
With this commit, it's possible that vmd is the first open source hypervisor that *defaults* to a multi-process device emulation model without requiring any additional configuration from the operator.
Testing help from phessler@ and Mischa Peters.
ok mlarkin@
|
#
1.86 |
|
25-Apr-2023 |
dv |
vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.
The object sent to vmm(4) contained file paths and details the kernel does not need for cpu virtualization as device emulation is in userland. Effectively, "pull up" the struct members from the vm_create_params struct to the parent vmop_create_params struct.
This allows us to clean up some of vmd(8) and simplify things for switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd, etc.) to allow users to boot recovery ramdisk kernels.
ok mlarkin@
|
#
1.85 |
|
23-Apr-2023 |
dv |
vmd(8): teach vmm process how to exec.
Use execvp(3) to launch vm children with new address spaces. Consequently, introduces use of unveil(2) into the vmm and vm processes.
This imposes the requirement of launching vmd with absolute paths, similar to sshd(8).
ok mlarkin@
|
#
1.84 |
|
23-Apr-2023 |
anton |
unbreak tree by coping with recent s/XCR0/XFEATURE rename
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.83 |
|
06-Feb-2023 |
dv |
vmd(8): scan pci bus to determine bootorder strings.
vmd's SeaBIOS bootorder strings had hardcoded pci device ids, so if a user added a network interface the bootorder strings didn't line up with reality. Using vmctl(8) to boot from a cdrom (-B cdrom) would fail, for instance, if attaching both a nic and a disk as well.
This change scans the pci devices and finds the first of each type to construct viable bootorder strings.
ok jan@
|
#
1.82 |
|
28-Jan-2023 |
dv |
Move some header definitions from vmm(4) to vmd(8).
Part of an ongoing effort to move userland-specific information out of a kernel header and directly into vmd(8). No functional change.
ok mlarkin@
|
#
1.81 |
|
08-Jan-2023 |
dv |
vmd(8): add thread names to vm process.
ok guenther@.
|
#
1.80 |
|
04-Jan-2023 |
dv |
Typos in vmd error message. No functional change.
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
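The off-by-one described above is the classic closed-versus-half-open interval mistake. A minimal sketch, assuming a simplified range table (`struct mem_range` and `find_range` are hypothetical and do not mirror vmd's actual structures):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a guest memory range. */
struct mem_range {
	uint64_t gpa;	/* guest physical start address */
	uint64_t size;	/* length in bytes */
};

/* Returns the index of the range containing gpa, or -1 if none. */
static int
find_range(const struct mem_range *r, size_t n, uint64_t gpa)
{
	size_t i;

	for (i = 0; i < n; i++) {
		/*
		 * Half-open interval [start, start + size): using "<="
		 * here would accept the byte one past the end and pick
		 * the wrong range when two ranges are adjacent.
		 */
		if (gpa >= r[i].gpa && gpa - r[i].gpa < r[i].size)
			return ((int)i);
	}
	return (-1);
}
```

Writing the test as `gpa - start < size` rather than `gpa < start + size` also avoids overflow when a range ends at the top of the address space.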
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of the roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (i.e. the event handler that receives messages from the vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (i.e. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used, as it does a free(vm). The vm struct holds the buffers for imsg, so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct), causing it to segfault.
reported by kn@ ok kn@
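The crash described above follows a general pattern: a handler frees the object that owns the buffer the dispatch loop is still reading from. The sketch below shows one way to avoid it by deferring the free until the loop has finished. It is an illustration under simplified assumptions, not vmd's code (the actual fix lets the vm process exit and leaves cleanup to the SIGCHLD handler); `struct conn`, `dispatch`, and the message types are hypothetical.

```c
#include <stdlib.h>

enum { MSG_DATA, MSG_DONE };	/* hypothetical message types */

/* Hypothetical stand-in for a struct that owns its message queue. */
struct conn {
	const int	*queue;	/* pending message types */
	size_t		 len;
	size_t		 pos;
};

/*
 * Handles queued messages until the queue is empty or MSG_DONE is
 * seen. Returns the number of messages handled. If MSG_DONE was
 * seen, c is freed, but only after the loop has stopped using it.
 */
static size_t
dispatch(struct conn *c)
{
	size_t handled = 0;
	int done = 0;

	while (c->pos < c->len) {
		int type = c->queue[c->pos++];

		handled++;
		if (type == MSG_DONE) {
			/* Do not free(c) here: the loop still reads c. */
			done = 1;
			break;
		}
	}
	if (done)
		free(c);
	return (handled);
}
```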
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
The previous implementation sometimes hit a deadlock, as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequent.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
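A small sketch of the convention described above, using close(2) as the example call. The helper name close_checked and its out-parameter are hypothetical. The error test compares against -1 exactly, and errno is read only on that path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Returns 0 on success or -1 on failure. On failure *err holds the
 * errno value reported by close(2); on success it is set to 0.
 */
int
close_checked(int fd, int *err)
{
	assert(err != NULL);

	*err = 0;
	if (close(fd) == -1) {	/* -1 is the only error return, not "< 0" */
		*err = errno;	/* errno is meaningful only on this path */
		return (-1);
	}
	return (0);
}
```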
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b, causing them to run very slowly
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
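As an illustration of the single-bitfield approach, here is a sketch with hypothetical flag and helper names (ST_RUNNING, st_set and so on). The real definitions live in vmd's headers and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state flags, one bit each. */
#define ST_RUNNING	0x01
#define ST_PAUSED	0x02
#define ST_SHUTDOWN	0x04

struct vm_sketch {
	uint32_t state;		/* OR of the ST_* flags above */
};

void
st_set(struct vm_sketch *vm, uint32_t flag)
{
	assert(vm != NULL);
	vm->state |= flag;
}

void
st_clear(struct vm_sketch *vm, uint32_t flag)
{
	assert(vm != NULL);
	vm->state &= ~flag;
}

int
st_isset(const struct vm_sketch *vm, uint32_t flag)
{
	assert(vm != NULL);
	return ((vm->state & flag) != 0);
}
```

A paused vm can then carry both the running and paused bits at once, which separate boolean variables make harder to report consistently.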
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power on defaults, and bump the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
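The idea behind populating an iovec array from guest physical addresses can be sketched as follows. This is an illustration under stated assumptions, not the iovec_mem() code: struct range_sketch and gpa_to_iov are hypothetical names, and the sketch assumes each guest range is mapped at a known host pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Hypothetical description of one guest memory range mapped in the host. */
struct range_sketch {
	uint64_t gpa;	/* guest physical start address */
	size_t	 size;	/* length in bytes */
	uint8_t	*hva;	/* where the range is mapped in this process */
};

/*
 * Fill iov[] with host pointers covering [gpa, gpa + len).
 * Returns the number of entries used, or -1 if the span touches
 * unmapped guest memory or needs more than iovcnt entries.
 */
int
gpa_to_iov(const struct range_sketch *r, size_t nr, uint64_t gpa, size_t len,
    struct iovec *iov, int iovcnt)
{
	size_t i, off, n;
	int used = 0;

	assert(r != NULL && iov != NULL);

	while (len > 0) {
		/* Find the range holding gpa (half-open interval check). */
		for (i = 0; i < nr; i++)
			if (gpa >= r[i].gpa && gpa - r[i].gpa < r[i].size)
				break;
		if (i == nr || used == iovcnt)
			return (-1);

		off = (size_t)(gpa - r[i].gpa);
		n = r[i].size - off;	/* bytes left in this range */
		if (n > len)
			n = len;

		iov[used].iov_base = r[i].hva + off;
		iov[used].iov_len = n;
		used++;
		gpa += n;
		len -= n;
	}
	return (used);
}
```

The resulting array can be handed to readv(2) or writev(2), so data moves between the disk image and guest memory without an intermediate bounce buffer.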
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs makes sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.91 |
|
06-Sep-2023 |
dv |
vmm(4)/vmd(8): include pending interrupt in vm_run_params.
To remove an ioctl(2) from the vcpu thread hotpath in vmd(8), add a flag in the vm_run_params structure to indicate if there's another interrupt pending. This reduces latency in vcpu work related to i/o as we save a trip into the kernel just to flip the interrupt pending flag on or off.
Tested by phessler@, mbuhl@, stsp@, and Mischa Peters.
ok mlarkin@
|
#
1.90 |
|
13-Jul-2023 |
dv |
vmd(8): pull validation into local prefix parser.
Validation for local prefixes, both inet and inet6, was scattered around. To make it even more confusing, vmd was using generic address parsing logic from prior network daemons. vmd doesn't need to parse addresses other than when parsing the local prefix settings in vm.conf and no runtime parsing is needed.
This change merges parsing and validation based on vmd's specific needs for local prefixes (e.g. reserving enough bits for vm id and network interface id encoding in an ipv4 address). In addition, it simplifies the struct from a generic address struct to one focused on just storing the v4 and v6 prefixes and masks. This removes a TAILQ struct member that isn't used by vmd and was leftover copy-pasta from those prior daemons.
The address parsing that vmd uses is also updated to using the latest logic in bgpd(8).
ok mlarkin@
|
#
1.89 |
|
13-May-2023 |
dv |
vmm(4)/vmd(8): switch to anonymous shared mappings.
While splitting out emulated virtio network and block devices into separate processes, I originally used named mappings via shm_mkstemp(3). While this functionally achieved the desired result, it had two unintended consequences:
1) tearing down a vm process and its child processes required excessive locking as the guest memory was tied into the VFS layer.
2) it was observed by mlarkin@ that actions in other parts of the VFS layer could cause some of the guest memory to flush to storage, possibly filling /tmp.
This commit adds a new vmm(4) ioctl dedicated to allowing a process to request that the kernel share a mapping of guest memory into its own vm space. This requires an open fd to /dev/vmm (requiring root) and both the "vmm" and "proc" pledge(2) promises. In addition, the caller must know enough about the original memory ranges to reconstruct them to make the vm's ranges.
Tested with help from Mischa Peters.
ok mlarkin@
|
#
1.88 |
|
28-Apr-2023 |
dv |
vmd(8)/vmctl(8): allow vm owners to override boot kernel.
vmd allows non-root users to "own" a vm defined in vm.conf(5). While the user can start/stop the vm, if they break their filesystem they have no means of booting recovery media like a ramdisk kernel.
This change opens the provided boot kernel via vmctl and passes the file descriptor through the control channel to vmd. The next boot of the vm will use the provided file descriptor as boot kernel/bios. Subsequent boots (e.g. a reboot) will return to using behavior defined in vm.conf or the default bios image.
ok mlarkin@
|
#
1.87 |
|
27-Apr-2023 |
dv |
vmd(8): introduce multi-process model for virtio devices.
Isolate virtio network and block device emulation in dedicated processes, forked and exec'd from the vm process. This allows for tightening pledge promises to just "stdio".
Communication between the vcpu's and these devices now occurs via imsg channels, which adds the benefit of not always blocking the vcpu thread while emulating the device.
With this commit, it's possible that vmd is the first open source hypervisor that *defaults* to a multi-process device emulation model without requiring any additional configuration from the operator.
Testing help from phessler@ and Mischa Peters.
ok mlarkin@
|
#
1.86 |
|
25-Apr-2023 |
dv |
vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.
The object sent to vmm(4) contained file paths and details the kernel does not need for cpu virtualization as device emulation is in userland. Effectively, "pull up" the struct members from the vm_create_params struct to the parent vmop_create_params struct.
This allows us to clean up some of vmd(8) and simplify things for switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd, etc.) to allow users to boot recovery ramdisk kernels.
ok mlarkin@
|
#
1.85 |
|
23-Apr-2023 |
dv |
vmd(8): teach vmm process how to exec.
Use execvp(2) to launch vm children with new address spaces. Consequently, introduces use of unveil(2) into the vmm and vm processes.
This imposes the requirement of launching vmd with absolute paths, similar to sshd(8).
ok mlarkin@
|
#
1.84 |
|
23-Apr-2023 |
anton |
unbreak tree by coping with recent s/XCR0/XFEATURE rename
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.83 |
|
06-Feb-2023 |
dv |
vmd(8): scan pci bus to determine bootorder strings.
vmd's SeaBIOS bootorder strings had hardcoded pci device ids, so if a user added a network interface the bootorder strings didn't line up with reality. Using vmctl(8) to boot from a cdrom (-B cdrom) would fail, for instance, if attaching both a nic and a disk as well.
This change scans the pci devices and finds the first of each type to construct viable bootorder strings.
ok jan@
|
#
1.82 |
|
28-Jan-2023 |
dv |
Move some header definitions from vmm(4) to vmd(8).
Part of an ongoing effort to move userland-specific information out of a kernel header and directly into vmd(8). No functional change.
ok mlarkin@
|
#
1.81 |
|
08-Jan-2023 |
dv |
vmd(8): add thread names to vm process.
ok guenther@.
|
#
1.80 |
|
04-Jan-2023 |
dv |
Typos in vmd error message. No functional change.
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte. any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
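To make the pfn versus address distinction concrete, here is a sketch assuming the legacy virtio-pci convention of 4096-byte queue pages. The macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define QUEUE_PAGE_SHIFT 12	/* legacy virtio-pci queue pages are 4096 bytes */

/* Convert the 32-bit page frame number written by the driver to a gpa. */
uint64_t
queue_pfn_to_gpa(uint32_t pfn)
{
	/* Widen before shifting so addresses at or above 4 GiB survive. */
	return ((uint64_t)pfn << QUEUE_PAGE_SHIFT);
}

/* Inverse mapping; the gpa must be page aligned and fit in 32 bits of pfn. */
uint32_t
gpa_to_queue_pfn(uint64_t gpa)
{
	assert((gpa & ((1ULL << QUEUE_PAGE_SHIFT) - 1)) == 0);
	assert((gpa >> QUEUE_PAGE_SHIFT) <= UINT32_MAX);
	return ((uint32_t)(gpa >> QUEUE_PAGE_SHIFT));
}
```

Treating the register value as a linear address would both misplace the queue and cap it below 4 GiB, which is why the terminology matters.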
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
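A sketch of a range check that avoids this kind of off-by-one. The function name gpa_in_range is hypothetical, and this is an illustration rather than the committed fix. The range is treated as half-open, so the first byte past the end is rejected.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns nonzero if gpa lies in [start, start + size).
 * Comparing the offset against size keeps the end exclusive and
 * avoids computing start + size directly.
 */
int
gpa_in_range(uint64_t gpa, uint64_t start, uint64_t size)
{
	assert(size <= UINT64_MAX - start);	/* range must not wrap */
	return (gpa >= start && gpa - start < size);
}
```

For a range starting at 0x1000 with size 0x1000, the last valid address is 0x1fff, and 0x2000 belongs to whatever range follows.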
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections from anyone. "Removing it makes sense" reyk (who wrote the FFS module). OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads, event_thread, that runs all the callbacks and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
The previous implementation sometimes hit a deadlock because the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b, causing them to run very slowly.
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
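A rough sketch of what tracking state in a single bitfield can look like. The flag names, values, and struct below are illustrative assumptions, not vmd's actual definitions.

```c
#include <stdint.h>

/* Hypothetical flag names and values; vmd's real ones may differ. */
#define EX_VM_STATE_RUNNING	0x01
#define EX_VM_STATE_PAUSED	0x02
#define EX_VM_STATE_SHUTDOWN	0x04

struct vm_example {
	uint32_t state;	/* one bit per state instead of separate ints */
};

/* Set one or more state bits. */
void
vm_set_state(struct vm_example *vm, uint32_t flags)
{
	vm->state |= flags;
}

/* Clear one or more state bits. */
void
vm_clear_state(struct vm_example *vm, uint32_t flags)
{
	vm->state &= ~flags;
}

/* Return non-zero when every bit in "flags" is set. */
int
vm_has_state(const struct vm_example *vm, uint32_t flags)
{
	return ((vm->state & flags) == flags);
}
```

Reporting then only needs to test individual bits, and several states can be checked in one call.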
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bump the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
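As a sketch of the readback described above: on PC hardware, bit 5 of port 0x61 reflects the output of PIT channel 2. The function and parameter names below are hypothetical and only show how an emulated read could fold that bit into the returned byte.

```c
#include <stdint.h>

#define PORT61_TMR2_OUT	0x20	/* bit 5: PIT channel 2 output state */

/*
 * Hypothetical read handler for port 0x61: "latched" holds the other
 * bits of the register, "ch2_fired" is non-zero once channel 2 has
 * finished counting.
 */
uint8_t
port61_read(uint8_t latched, int ch2_fired)
{
	if (ch2_fired)
		return ((uint8_t)(latched | PORT61_TMR2_OUT));
	return ((uint8_t)(latched & ~PORT61_TMR2_OUT));
}
```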
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
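The idea behind a vaddr_mem()-style lookup can be sketched as below. This is not the vmd implementation: the struct and the function gpa_to_ptr are hypothetical, and the sketch simply encodes the stated assumption that a contiguous guest physical span must sit inside one memory range.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one guest memory range. */
struct mem_range_example {
	uint64_t gpa;	/* guest physical start address */
	uint64_t size;	/* length in bytes */
	uint8_t	*hva;	/* where the range is mapped in this process */
};

/*
 * Return a host pointer for [gpa, gpa + len), or NULL when the span is
 * not fully contained in a single range.
 */
void *
gpa_to_ptr(const struct mem_range_example *r, size_t n, uint64_t gpa,
    uint64_t len)
{
	uint64_t off;
	size_t i;

	for (i = 0; i < n; i++) {
		if (gpa < r[i].gpa)
			continue;
		off = gpa - r[i].gpa;
		/* Written to avoid overflow in "off + len". */
		if (off >= r[i].size || len > r[i].size - off)
			continue;
		return (r[i].hva + off);
	}
	return (NULL);
}
```

A caller that gets NULL back would have to fall back to copying, since the span either crosses a range boundary or is not guest memory at all.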
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
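The rate limit described above can be approximated with a little arithmetic. The sketch below assumes 8N1 framing (10 bits on the wire per character), which the commit message does not state, and the function names are hypothetical rather than the ns8250 module's own.

```c
#include <stdint.h>

/* Assumption: 1 start bit + 8 data bits + 1 stop bit per character. */
#define BITS_PER_CHAR	10

/* Characters per second a UART could sustain at this baud rate. */
uint32_t
chars_per_sec(uint32_t baud)
{
	return (baud / BITS_PER_CHAR);
}

/*
 * Microseconds to pause after transmitting "burst" characters so the
 * average output rate does not exceed the programmed baud rate.
 */
uint64_t
burst_delay_usec(uint32_t baud, uint32_t burst)
{
	if (baud == 0)
		return (0);
	return ((uint64_t)burst * BITS_PER_CHAR * 1000000ULL / baud);
}
```

Under that assumption 9600 baud allows roughly 960 characters per second and 115200 baud roughly 11520, which is why the faster setting gives a more responsive console.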
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.90 |
|
13-Jul-2023 |
dv |
vmd(8): pull validation into local prefix parser.
Validation for local prefixes, both inet and inet6, was scattered around. To make it even more confusing, vmd was using generic address parsing logic from prior network daemons. vmd doesn't need to parse addresses other than when parsing the local prefix settings in vm.conf and no runtime parsing is needed.
This change merges parsing and validation based on vmd's specific needs for local prefixes (e.g. reserving enough bits for vm id and network interface id encoding in an ipv4 address). In addition, it simplifies the struct from a generic address struct to one focused on just storing the v4 and v6 prefixes and masks. This cleans up an unused TAILQ struct member that isn't used by vmd and was leftover copy-pasta from those prior daemons.
The address parsing that vmd uses is also updated to using the latest logic in bgpd(8).
ok mlarkin@
|
#
1.89 |
|
13-May-2023 |
dv |
vmm(4)/vmd(8): switch to anonymous shared mappings.
While splitting out emulated virtio network and block devices into separate processes, I originally used named mappings via shm_mkstemp(3). While this functionally achieved the desired result, it had two unintended consequences:
1) tearing down a vm process and its child processes required excessive locking as the guest memory was tied into the VFS layer.
2) it was observed by mlarkin@ that actions in other parts of the VFS layer could cause some of the guest memory to flush to storage, possibly filling /tmp.
This commit adds a new vmm(4) ioctl dedicated to allowing a process to request that the kernel share a mapping of guest memory into its own vm space. This requires an open fd to /dev/vmm (requiring root) and both the "vmm" and "proc" pledge(2) promises. In addition, the caller must know enough about the original memory ranges to reconstruct them to make the vm's ranges.
Tested with help from Mischa Peters.
ok mlarkin@
|
#
1.88 |
|
28-Apr-2023 |
dv |
vmd(8)/vmctl(8): allow vm owners to override boot kernel.
vmd allows non-root users to "own" a vm defined in vm.conf(5). While the user can start/stop the vm, if they break their filesystem they have no means of booting recovery media like a ramdisk kernel.
This change opens the provided boot kernel via vmctl and passes the file descriptor through the control channel to vmd. The next boot of the vm will use the provided file descriptor as boot kernel/bios. Subsequent boots (e.g. a reboot) will return to using behavior defined in vm.conf or the default bios image.
ok mlarkin@
|
#
1.87 |
|
27-Apr-2023 |
dv |
vmd(8): introduce multi-process model for virtio devices.
Isolate virtio network and block device emulation in dedicated processes, forked and exec'd from the vm process. This allows for tightening pledge promises to just "stdio".
Communication between the vcpu's and these devices now occurs via imsg channels, which adds the benefit of not always blocking the vcpu thread while emulating the device.
With this commit, it's possible that vmd is the first open source hypervisor that *defaults* to a multi-process device emulation model without requiring any additional configuration from the operator.
Testing help from phessler@ and Mischa Peters.
ok mlarkin@
|
#
1.86 |
|
25-Apr-2023 |
dv |
vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.
The object sent to vmm(4) contained file paths and details the kernel does not need for cpu virtualization as device emulation is in userland. Effectively, "pull up" the struct members from the vm_create_params struct to the parent vmop_create_params struct.
This allows us to clean up some of vmd(8) and simplify things for switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd, etc.) to allow users to boot recovery ramdisk kernels.
ok mlarkin@
|
#
1.85 |
|
23-Apr-2023 |
dv |
vmd(8): teach vmm process how to exec.
Use execvp(2) to launch vm children with new address spaces. Consequently, introduces use of unveil(2) into the vmm and vm processes.
This imposes the requirement of launching vmd with absolute paths, similar to sshd(8).
ok mlarkin@
|
#
1.84 |
|
23-Apr-2023 |
anton |
unbreak tree by coping with recent s/XCR0/XFEATURE rename
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.83 |
|
06-Feb-2023 |
dv |
vmd(8): scan pci bus to determine bootorder strings.
vmd's SeaBIOS bootorder strings had hardcoded pci device ids, so if a user added a network interface the bootorder strings didn't line up with reality. Using vmctl(8) to boot from a cdrom (-B cdrom) would fail, for instance, if attaching both a nic and a disk as well.
This change scans the pci devices and finds the first of each type to construct viable bootorder strings.
ok jan@
|
#
1.82 |
|
28-Jan-2023 |
dv |
Move some header definitions from vmm(4) to vmd(8).
Part of an ongoing effort to move userland-specific information out of a kernel header and directly into vmd(8). No functional change.
ok mlarkin@
|
#
1.81 |
|
08-Jan-2023 |
dv |
vmd(8): add thread names to vm process.
ok guenther@.
|
#
1.80 |
|
04-Jan-2023 |
dv |
Typos in vmd error message. No functional change.
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
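A hedged sketch of turning purpose-tagged memory ranges into e820 entries. The e820 type codes 1 (usable RAM) and 2 (reserved) are the standard values; the range tags, struct names, and build_e820 function are assumptions for illustration, and the packed attribute is a GCC/Clang extension.

```c
#include <stddef.h>
#include <stdint.h>

#define E820_RAM	1	/* usable memory */
#define E820_RESERVED	2	/* reserved, not usable by the guest OS */

/* Hypothetical range tags; vmd's own names may differ. */
enum range_kind { RANGE_RAM, RANGE_RESERVED, RANGE_MMIO };

struct range_example {
	uint64_t	start;
	uint64_t	size;
	enum range_kind	kind;
};

/* One e820 entry: 64-bit base, 64-bit length, 32-bit type. */
struct e820_entry {
	uint64_t addr;
	uint64_t len;
	uint32_t type;
} __attribute__((packed));

/* Fill "out" (room for n entries) from the tagged ranges; returns n. */
size_t
build_e820(const struct range_example *r, size_t n, struct e820_entry *out)
{
	size_t i;

	for (i = 0; i < n; i++) {
		out[i].addr = r[i].start;
		out[i].len = r[i].size;
		/* Only plain RAM is reported as usable. */
		out[i].type = (r[i].kind == RANGE_RAM) ?
		    E820_RAM : E820_RESERVED;
	}
	return (n);
}
```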
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
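To make the pfn-versus-address distinction concrete: with the legacy virtio-pci interface the driver writes a 32-bit page frame number and the device derives the guest physical address by multiplying by the 4096-byte queue alignment. The function name below is hypothetical.

```c
#include <stdint.h>

/* Legacy virtio-pci queues are aligned to 4096 bytes. */
#define VIRTQ_PAGE_SHIFT	12

/* Convert the 32-bit queue pfn written by the driver to a gpa. */
uint64_t
queue_pfn_to_gpa(uint32_t pfn)
{
	/* Widen first so pfns at or above 4 GiB / 4096 do not overflow. */
	return ((uint64_t)pfn << VIRTQ_PAGE_SHIFT);
}
```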
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
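The class of bug described above can be illustrated with a half-open range check. This is a sketch with a hypothetical function name, not the vmd code: the last valid byte of a range is start + size - 1, so start + size itself must be rejected.

```c
#include <stdint.h>

/*
 * Return non-zero when gpa lies in [start, start + size).
 * The buggy form would be "gpa <= start + size", which accepts one
 * byte past the end of the range.
 */
int
gpa_in_range(uint64_t gpa, uint64_t start, uint64_t size)
{
	return (gpa >= start && gpa - start < size);
}
```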
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
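A small sketch of working in bytes while still enforcing whole-megabyte sizes, as the message describes. The macro and function names are hypothetical, and treating a zero size as invalid is an assumption.

```c
#include <stdint.h>

#define EX_MB(x)	((uint64_t)(x) * 1024 * 1024)

/* Sizes are carried in bytes, but must be a whole number of MB. */
int
mem_size_valid(uint64_t bytes)
{
	return (bytes != 0 && bytes % EX_MB(1) == 0);
}

/* Convert back to megabytes only for display purposes. */
uint64_t
bytes_to_mb(uint64_t bytes)
{
	return (bytes / EX_MB(1));
}
```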
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a round about way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (i.e. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (i.e. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b, causing them to run very slowly.
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bump the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
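The lookup order described above (derived image first, then the base) can be sketched as follows. This is a simplified in-memory model under stated assumptions: ex_image, EX_NCLUSTERS and ex_lookup are hypothetical names, and the real qcow2 code works on on-disk cluster tables rather than pointer arrays.

```c
#include <stddef.h>

#define EX_NCLUSTERS 4

/* Hypothetical in-memory model: NULL marks a cluster the image does not hold. */
struct ex_image {
	const char *cluster[EX_NCLUSTERS];
	const struct ex_image *base;	/* NULL for a base image */
};

/* Walk from the derived image toward its base until the cluster is found. */
static const char *
ex_lookup(const struct ex_image *img, size_t idx)
{
	if (idx >= EX_NCLUSTERS)
		return (NULL);
	for (; img != NULL; img = img->base)
		if (img->cluster[idx] != NULL)
			return (img->cluster[idx]);
	return (NULL);	/* not allocated in any image of the chain */
}
```

Because a derived image stores only the clusters it has written itself, changing the base underneath it alters what the fallback path returns, which is the corruption hazard the commit message warns about.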
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re-add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
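A minimal sketch of the kind of translation vaddr_mem() is described as doing, including the single-range assumption. The ex_range struct and ex_vaddr function are hypothetical stand-ins; the real function's signature and data structures are not shown in this log.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one guest memory range mapped into the host. */
struct ex_range {
	uint64_t gpa;	/* guest physical start */
	size_t	 size;	/* length in bytes */
	uint8_t	*hva;	/* host virtual address of the mapping */
};

/*
 * Return a host pointer for [gpa, gpa + len), or NULL when the span is
 * not fully contained in a single range.
 */
static void *
ex_vaddr(const struct ex_range *r, size_t n, uint64_t gpa, size_t len)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (gpa < r[i].gpa || gpa - r[i].gpa >= r[i].size)
			continue;
		if (len > r[i].size - (gpa - r[i].gpa))
			return (NULL);	/* would cross the end of the range */
		return (r[i].hva + (gpa - r[i].gpa));
	}
	return (NULL);
}
```

Returning NULL for a span that crosses a range boundary makes the contiguity assumption explicit instead of silently handing back a pointer that runs off the mapping.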
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works), skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
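For reference, a /31 subnet contains exactly two addresses that differ only in the lowest bit, so either end can derive its peer with a single XOR. The sketch below assumes host-order IPv4 addresses; ex_slash31_peer is a hypothetical name, and which of the two addresses vmd assigns to the host and which to the VM is not stated in this log.

```c
#include <stdint.h>

/* Host-order IPv4 addresses; the two members of a /31 differ in the low bit. */
static uint32_t
ex_slash31_peer(uint32_t addr)
{
	return (addr ^ 1u);
}
```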
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
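The first bug concerns how an 8-bit port read is merged into the guest's %eax: only the low byte should change. A minimal sketch of that merge, using the hypothetical helper name ex_inb_result (the actual vmd code path is not shown in this log):

```c
#include <stdint.h>

/* Merge an 8-bit port read into the low byte of eax, keeping bits 8..31. */
static uint32_t
ex_inb_result(uint32_t eax, uint8_t data)
{
	return ((eax & ~0xffu) | data);
}
```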
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.89 |
|
13-May-2023 |
dv |
vmm(4)/vmd(8): switch to anonymous shared mappings.
While splitting out emulated virtio network and block devices into separate processes, I originally used named mappings via shm_mkstemp(3). While this functionally achieved the desired result, it had two unintended consequences:
1) tearing down a vm process and its child processes required excessive locking as the guest memory was tied into the VFS layer.
2) it was observed by mlarkin@ that actions in other parts of the VFS layer could cause some of the guest memory to flush to storage, possibly filling /tmp.
This commit adds a new vmm(4) ioctl dedicated to allowing a process to request that the kernel share a mapping of guest memory into its own vm space. This requires an open fd to /dev/vmm (requiring root) and both the "vmm" and "proc" pledge(2) promises. In addition, the caller must know enough about the original memory ranges to reconstruct them to make the vm's ranges.
Tested with help from Mischa Peters.
ok mlarkin@
|
#
1.88 |
|
28-Apr-2023 |
dv |
vmd(8)/vmctl(8): allow vm owners to override boot kernel.
vmd allows non-root users to "own" a vm defined in vm.conf(5). While the user can start/stop the vm, if they break their filesystem they have no means of booting recovery media like a ramdisk kernel.
This change opens the provided boot kernel via vmctl and passes the file descriptor through the control channel to vmd. The next boot of the vm will use the provided file descriptor as boot kernel/bios. Subsequent boots (e.g. a reboot) will return to using behavior defined in vm.conf or the default bios image.
ok mlarkin@
|
#
1.87 |
|
27-Apr-2023 |
dv |
vmd(8): introduce multi-process model for virtio devices.
Isolate virtio network and block device emulation in dedicated processes, forked and exec'd from the vm process. This allows for tightening pledge promises to just "stdio".
Communication between the vcpu's and these devices now occurs via imsg channels, which adds the benefit of not always blocking the vcpu thread while emulating the device.
With this commit, it's possible that vmd is the first open source hypervisor that *defaults* to a multi-process device emulation model without requiring any additional configuration from the operator.
Testing help from phessler@ and Mischa Peters.
ok mlarkin@
|
#
1.86 |
|
25-Apr-2023 |
dv |
vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.
The object sent to vmm(4) contained file paths and details the kernel does not need for cpu virtualization as device emulation is in userland. Effectively, "pull up" the struct members from the vm_create_params struct to the parent vmop_create_params struct.
This allows us to clean up some of vmd(8) and simplify things for switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd, etc.) to allow users to boot recovery ramdisk kernels.
ok mlarkin@
|
#
1.85 |
|
23-Apr-2023 |
dv |
vmd(8): teach vmm process how to exec.
Use execvp(2) to launch vm children with new address spaces. Consequently, introduces use of unveil(2) into the vmm and vm processes.
This imposes the requirement of launching vmd with absolute paths, similar to sshd(8).
ok mlarkin@
|
#
1.84 |
|
23-Apr-2023 |
anton |
unbreak tree by coping with recent s/XCR0/XFEATURE rename
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.83 |
|
06-Feb-2023 |
dv |
vmd(8): scan pci bus to determine bootorder strings.
vmd's SeaBIOS bootorder strings had hardcoded pci device ids, so if a user added a network interface the bootorder strings didn't line up with reality. Using vmctl(8) to boot from a cdrom (-B cdrom) would fail, for instance, if attaching both a nic and a disk as well.
This change scans the pci devices and finds the first of each type to construct viable bootorder strings.
ok jan@
|
#
1.82 |
|
28-Jan-2023 |
dv |
Move some header definitions from vmm(4) to vmd(8).
Part of an ongoing effort to move userland-specific information out of a kernel header and directly into vmd(8). No functional change.
ok mlarkin@
|
#
1.81 |
|
08-Jan-2023 |
dv |
vmd(8): add thread names to vm process.
ok guenther@.
|
#
1.80 |
|
04-Jan-2023 |
dv |
Typos in vmd error message. No functional change.
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
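To make the pfn-versus-address distinction concrete: with the 4096-byte page size used by the legacy virtio-pci interface, the guest physical address of a queue is the 32-bit page frame number shifted left by 12. The names below are hypothetical, not vmd's.

```c
#include <stdint.h>

#define EX_VIRTIO_PAGE_SHIFT 12	/* 4096-byte pages, as in legacy virtio-pci */

/* Convert the driver-supplied page frame number to a guest physical address. */
static uint64_t
ex_queue_gpa(uint32_t pfn)
{
	return ((uint64_t)pfn << EX_VIRTIO_PAGE_SHIFT);
}
```

Widening to 64 bits before the shift matters: a pfn of 0x100000 already maps to an address at the 4 GiB boundary, which does not fit in 32 bits.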
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
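The fix amounts to treating each memory range as a half-open interval. A minimal sketch of such a check, with the hypothetical name ex_in_range (vmd's own function is not shown in this log):

```c
#include <stdint.h>

/* Is gpa inside the half-open range [start, start + size)? */
static int
ex_in_range(uint64_t start, uint64_t size, uint64_t gpa)
{
	/* gpa - start < size avoids overflow in start + size */
	return (gpa >= start && gpa - start < size);
}
```

With start 0x1000 and size 0x1000, address 0x1fff is the last valid byte and 0x2000 already belongs to whatever follows.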
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
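An under-read happens when a single read(2) returns fewer bytes than requested and the caller treats the buffer as full. The sketch below shows the general loop-until-complete pattern; it is a stand-in written for illustration, not the atomicio code the commit refers to, and ex_read_all is a hypothetical name.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Read exactly len bytes unless EOF or an error intervenes.
 * Returns the number of bytes read, or -1 on error.
 */
static ssize_t
ex_read_all(int fd, void *buf, size_t len)
{
	char *p = buf;
	size_t done = 0;
	ssize_t n;

	while (done < len) {
		n = read(fd, p + done, len - done);
		if (n == -1) {
			if (errno == EINTR)
				continue;	/* interrupted: retry */
			return (-1);
		}
		if (n == 0)
			break;			/* EOF: report the short count */
		done += (size_t)n;
	}
	return ((ssize_t)done);
}
```

A caller expecting a fixed-size message compares the return value against len and treats anything shorter as a failure.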
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty infrequent.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b, causing them to run very slowly
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guests physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
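A rough sketch of the idea behind iovec_mem() follows: translate a span of guest physical addresses into host pointers that readv(2)/writev(2) can use directly. Everything here is an assumption made for illustration: `struct ex_range` and `ex_iovec_mem()` are hypothetical and do not reproduce vmd's actual structures or signature.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Hypothetical range descriptor; vmd's real structures differ. */
struct ex_range {
	uint64_t gpa;	/* guest physical start address */
	size_t size;	/* length in bytes */
	uint8_t *hva;	/* where the range is mapped in this process */
};

/*
 * Fill iov[] with host pointers covering guest addresses
 * [gpa, gpa + len). Returns the number of entries used, or -1 if the
 * span is not fully backed or needs more than iovcnt entries.
 * The ranges must be sorted by gpa.
 */
int
ex_iovec_mem(const struct ex_range *r, size_t nr, uint64_t gpa, size_t len,
    struct iovec *iov, int iovcnt)
{
	size_t i, off, chunk;
	int n = 0;

	for (i = 0; i < nr && len > 0; i++) {
		/* Skip ranges that do not contain the current address. */
		if (gpa < r[i].gpa || gpa - r[i].gpa >= r[i].size)
			continue;
		if (n == iovcnt)
			return -1;
		off = (size_t)(gpa - r[i].gpa);
		chunk = r[i].size - off;
		if (chunk > len)
			chunk = len;
		iov[n].iov_base = r[i].hva + off;
		iov[n].iov_len = chunk;
		n++;
		gpa += chunk;
		len -= chunk;
	}
	return len == 0 ? n : -1;
}
```

A span that crosses a range boundary yields one iovec entry per range, so no bounce buffer is needed; a vaddr_mem()-style helper would instead insist that the whole span fit in a single range.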
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
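The rate calculation can be illustrated with a small worked example. This is a sketch under assumptions, not the ns8250 module's code: it assumes 10 bits on the wire per character (8N1 framing) and takes the timer tick rate as a parameter; `EX_BITS_PER_CHAR` and `ex_chars_per_tick()` are hypothetical names.

```c
/* Assumed 8N1 framing: 1 start bit + 8 data bits + 1 stop bit. */
#define EX_BITS_PER_CHAR 10u

/*
 * Number of characters that may be transmitted per timer tick before a
 * delay of one tick is inserted, for a given baud rate and tick rate.
 * Always allows at least one character so output makes progress.
 */
unsigned int
ex_chars_per_tick(unsigned int baud, unsigned int hz)
{
	unsigned int n;

	if (hz == 0)
		return 1;
	n = baud / EX_BITS_PER_CHAR / hz;
	return n > 0 ? n : 1;
}
```

With a 100 Hz tick, 115200 baud allows roughly twelve times as many characters per tick as 9600 baud, which is why the faster setting gives a noticeably more responsive console.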
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.88 |
|
28-Apr-2023 |
dv |
vmd(8)/vmctl(8): allow vm owners to override boot kernel.
vmd allows non-root users to "own" a vm defined in vm.conf(5). While the user can start/stop the vm, if they break their filesystem they have no means of booting recovery media like a ramdisk kernel.
This change opens the provided boot kernel via vmctl and passes the file descriptor through the control channel to vmd. The next boot of the vm will use the provided file descriptor as boot kernel/bios. Subsequent boots (e.g. a reboot) will return to using behavior defined in vm.conf or the default bios image.
ok mlarkin@
|
#
1.87 |
|
27-Apr-2023 |
dv |
vmd(8): introduce multi-process model for virtio devices.
Isolate virtio network and block device emulation in dedicated processes, forked and exec'd from the vm process. This allows for tightening pledge promises to just "stdio".
Communication between the vcpu's and these devices now occurs via imsg channels, which adds the benefit of not always blocking the vcpu thread while emulating the device.
With this commit, it's possible that vmd is the first open source hypervisor that *defaults* to a multi-process device emulation model without requiring any additional configuration from the operator.
Testing help from phessler@ and Mischa Peters.
ok mlarkin@
|
#
1.86 |
|
25-Apr-2023 |
dv |
vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.
The object sent to vmm(4) contained file paths and details the kernel does not need for cpu virtualization as device emulation is in userland. Effectively, "pull up" the struct members from the vm_create_params struct to the parent vmop_create_params struct.
This allows us to clean up some of vmd(8) and simplify things for switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd, etc.) to allow users to boot recovery ramdisk kernels.
ok mlarkin@
|
#
1.85 |
|
23-Apr-2023 |
dv |
vmd(8): teach vmm process how to exec.
Use execvp(2) to launch vm children with new address spaces. Consequently, introduces use of unveil(2) into the vmm and vm processes.
This imposes the requirement of launching vmd with absolute paths, similar to sshd(8).
ok mlarkin@
|
#
1.84 |
|
23-Apr-2023 |
anton |
unbreak tree by coping with recent s/XCR0/XFEATURE rename
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.83 |
|
06-Feb-2023 |
dv |
vmd(8): scan pci bus to determine bootorder strings.
vmd's SeaBIOS bootorder strings had hardcoded pci device ids, so if a user added a network interface the bootorder strings didn't line up with reality. Using vmctl(8) to boot from a cdrom (-B cdrom) would fail, for instance, if attaching both a nic and a disk as well.
This change scans the pci devices and finds the first of each type to construct viable bootorder strings.
ok jan@
|
#
1.82 |
|
28-Jan-2023 |
dv |
Move some header definitions from vmm(4) to vmd(8).
Part of an ongoing effort to move userland-specific information out of a kernel header and directly into vmd(8). No functional change.
ok mlarkin@
|
#
1.81 |
|
08-Jan-2023 |
dv |
vmd(8): add thread names to vm process.
ok guenther@.
|
#
1.80 |
|
04-Jan-2023 |
dv |
Typos in vmd error message. No functional change.
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
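The class of bug fixed here is easy to show in isolation. The sketch below is not vmd's code; `struct ex_mem_range` and `ex_find_range()` are hypothetical. It treats each range as the half-open interval [gpa, gpa + size), so the address one byte past the end belongs to the next range.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a guest memory range; not vmd's struct. */
struct ex_mem_range {
	uint64_t gpa;	/* guest physical start address */
	uint64_t size;	/* length in bytes */
};

/*
 * Return the index of the range containing gpa, or -1 if none does.
 * Writing the upper bound as "gpa - start < size" avoids both the
 * off-by-one ("<=") and overflow in "start + size".
 */
int
ex_find_range(const struct ex_mem_range *r, size_t n, uint64_t gpa)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (gpa >= r[i].gpa && gpa - r[i].gpa < r[i].size)
			return (int)i;
	}
	return -1;
}
```

With two adjacent 4 KB ranges starting at 0x0 and 0x1000, address 0x1000 resolves to the second range; an inclusive upper bound would have matched the first.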
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (i.e. the event handler to receive messages from the vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (i.e. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
The previous implementation sometimes hit a deadlock as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
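The pattern of folding several state variables into one bitfield can be sketched as below. The flag names, `struct ex_vm`, and the helper functions are hypothetical and chosen for illustration; they are not the identifiers used in vmd.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag names; vmd's own names may differ. */
#define EX_VM_RUNNING	0x01u
#define EX_VM_PAUSED	0x02u
#define EX_VM_SHUTDOWN	0x04u

struct ex_vm {
	uint32_t state;	/* bitwise OR of EX_VM_* flags */
};

/* Set one or more state flags. */
void
ex_vm_set(struct ex_vm *vm, uint32_t flags)
{
	vm->state |= flags;
}

/* Clear one or more state flags. */
void
ex_vm_clear(struct ex_vm *vm, uint32_t flags)
{
	vm->state &= ~flags;
}

/* Return true if any of the given flags is set. */
bool
ex_vm_test(const struct ex_vm *vm, uint32_t flags)
{
	return (vm->state & flags) != 0;
}
```

A single word of flags can be copied into a status message or compared in one test, which is the reporting convenience the commit message refers to.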
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initialize the %drX registers to power-on defaults, and bump the VM send/receive header to reflect the same.
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is currently used only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guests physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
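The rate calculation described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name chars_per_tick() is hypothetical, 10 bits per character assumes 8N1 framing (start bit, 8 data bits, stop bit), and hz stands for the number of timer ticks per second. The real ns8250 code may compute its limits differently.

```c
#include <stdint.h>

/* Assumption: 8N1 framing, so each character costs 10 bits on the wire. */
#define BITS_PER_CHAR	10

/*
 * Approximate how many characters may be transmitted between two
 * pauses of 1/hz seconds so that output does not exceed the
 * programmed baud rate.  Always allows at least one character so
 * that slow rates still make progress.
 */
static uint32_t
chars_per_tick(uint32_t baud, uint32_t hz)
{
	uint32_t n;

	if (hz == 0)
		return 0;
	n = baud / BITS_PER_CHAR / hz;
	return (n == 0 ? 1 : n);
}
```

With hz set to 100, a guest programmed for 115200 baud may send 115 characters per tick, while one programmed for 9600 baud may send only 9, which is why the faster setting gives a more responsive console.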
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.86 |
|
25-Apr-2023 |
dv |
vmm(4)/vmd(8): pull struct members out of vmm ioctl create struct.
The object sent to vmm(4) contained file paths and details the kernel does not need for cpu virtualization as device emulation is in userland. Effectively, "pull up" the struct members from the vm_create_params struct to the parent vmop_create_params struct.
This allows us to clean up some of vmd(8) and simplify things for switching to having vmctl(8) open the "kernel" file (SeaBIOS, bsd.rd, etc.) to allow users to boot recovery ramdisk kernels.
ok mlarkin@
|
#
1.85 |
|
23-Apr-2023 |
dv |
vmd(8): teach vmm process how to exec.
Use execvp(2) to launch vm children with new address spaces. Consequently, introduces use of unveil(2) into the vmm and vm processes.
This imposes the requirement of launching vmd with absolute paths, similar to sshd(8).
ok mlarkin@
|
#
1.84 |
|
23-Apr-2023 |
anton |
unbreak tree by coping with recent s/XCR0/XFEATURE rename
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.83 |
|
06-Feb-2023 |
dv |
vmd(8): scan pci bus to determine bootorder strings.
vmd's SeaBIOS bootorder strings had hardcoded pci device ids, so if a user added a network interface the bootorder strings didn't line up with reality. Using vmctl(8) to boot from a cdrom (-B cdrom) would fail, for instance, if attaching both a nic and a disk as well.
This change scans the pci devices and finds the first of each type to construct viable bootorder strings.
ok jan@
|
#
1.82 |
|
28-Jan-2023 |
dv |
Move some header definitions from vmm(4) to vmd(8).
Part of an ongoing effort to move userland-specific information out of a kernel header and directly into vmd(8). No functional change.
ok mlarkin@
|
#
1.81 |
|
08-Jan-2023 |
dv |
vmd(8): add thread names to vm process.
ok guenther@.
|
#
1.80 |
|
04-Jan-2023 |
dv |
Typos in vmd error message. No functional change.
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
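The distinction between a queue pfn and a linear address can be shown with a small sketch. The function name queue_pfn_to_gpa() and the VIRTIO_PAGE_SIZE macro are illustrative names, not identifiers taken from vmd; the 4096-byte page size is the value used by the legacy virtio-pci interface.

```c
#include <stdint.h>

/* Page size used by the legacy virtio-pci queue address register. */
#define VIRTIO_PAGE_SIZE	4096ULL

/*
 * Convert the 32-bit page frame number written by the guest driver
 * into a guest physical address.  The multiplication is done in 64
 * bits so that queues located above 4G are not truncated.
 */
static uint64_t
queue_pfn_to_gpa(uint32_t pfn)
{
	return (uint64_t)pfn * VIRTIO_PAGE_SIZE;
}
```

Treating the register value as an address rather than a page number would place the queue at the wrong location for any value other than zero.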
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
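The kind of check involved can be sketched as below. The struct mem_range type and the range_contains() function are hypothetical and only illustrate the half-open interval test; they are not the code that was changed in vmd. An off-by-one of the kind described would accept gpa == start + size, which is the first byte after the range.

```c
#include <stdint.h>

/* Hypothetical stand-in for one guest memory range (not vmd's struct). */
struct mem_range {
	uint64_t gpa;	/* guest physical address of the first byte */
	uint64_t size;	/* length of the range in bytes */
};

/*
 * Return 1 if gpa lies inside the half-open interval
 * [r->gpa, r->gpa + r->size), otherwise 0.  Comparing the offset
 * against the size avoids overflow in r->gpa + r->size.
 */
static int
range_contains(const struct mem_range *r, uint64_t gpa)
{
	return (gpa >= r->gpa && gpa - r->gpa < r->size);
}
```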
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a round about way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (i.e. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (i.e. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
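The convention can be illustrated with a short sketch. The helper open_readonly() is a hypothetical wrapper written for this example, not a function from vmd; only open(2) and errno are real interfaces here.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open a file read-only.  On failure, returns -1 and stores the
 * errno value in *saved_errno; on success, returns the descriptor
 * and stores 0.
 */
static int
open_readonly(const char *path, int *saved_errno)
{
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1) {		/* compare against -1, not "fd < 0" */
		*saved_errno = errno;
		return -1;
	}
	*saved_errno = 0;
	return fd;
}
```

Checking for exactly -1 matches the documented failure value, and errno is only consulted on that path because it is not meaningful after a successful call.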
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guests physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) skylake -> broadwell (don't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where the OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch; the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.85 |
|
23-Apr-2023 |
dv |
vmd(8): teach vmm process how to exec.
Use execvp(2) to launch vm children with new address spaces. Consequently, introduces use of unveil(2) into the vmm and vm processes.
This imposes the requirement of launching vmd with absolute paths, similar to sshd(8).
ok mlarkin@
|
#
1.84 |
|
23-Apr-2023 |
anton |
unbreak tree by coping with recent s/XCR0/XFEATURE rename
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.83 |
|
06-Feb-2023 |
dv |
vmd(8): scan pci bus to determine bootorder strings.
vmd's SeaBIOS bootorder strings had hardcoded pci device ids, so if a user added a network interface the bootorder strings didn't line up with reality. Using vmctl(8) to boot from a cdrom (-B cdrom) would fail, for instance, if attaching both a nic and a disk as well.
This change scans the pci devices and finds the first of each type to construct viable bootorder strings.
ok jan@
|
#
1.82 |
|
28-Jan-2023 |
dv |
Move some header definitions from vmm(4) to vmd(8).
Part of an ongoing effort to move userland-specific information out of a kernel header and directly into vmd(8). No functional change.
ok mlarkin@
|
#
1.81 |
|
08-Jan-2023 |
dv |
vmd(8): add thread names to vm process.
ok guenther@.
|
#
1.80 |
|
04-Jan-2023 |
dv |
Typos in vmd error message. No functional change.
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte
any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (i.e. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (i.e. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect the same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch; the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.85 |
|
23-Apr-2023 |
dv |
vmd(8): teach vmm process how to exec.
Use execvp(2) to launch vm children with new address spaces. Consequently, introduces use of unveil(2) into the vmm and vm processes.
This imposes the requirement of launching vmd with absolute paths, similar to sshd(8).
ok mlarkin@
|
#
1.84 |
|
23-Apr-2023 |
anton |
unbreak tree by coping with recent s/XCR0/XFEATURE rename
|
Revision tags: OPENBSD_7_3_BASE
|
#
1.83 |
|
06-Feb-2023 |
dv |
vmd(8): scan pci bus to determine bootorder strings.
vmd's SeaBIOS bootorder strings had hardcoded pci device ids, so if a user added a network interface the bootorder strings didn't line up with reality. Using vmctl(8) to boot from a cdrom (-B cdrom) would fail, for instance, if attaching both a nic and a disk as well.
This change scans the pci devices and finds the first of each type to construct viable bootorder strings.
ok jan@
|
#
1.82 |
|
28-Jan-2023 |
dv |
Move some header definitions from vmm(4) to vmd(8).
Part of an ongoing effort to move userland-specific information out of a kernel header and directly into vmd(8). No functional change.
ok mlarkin@
|
#
1.81 |
|
08-Jan-2023 |
dv |
vmd(8): add thread names to vm process.
ok guenther@.
|
#
1.80 |
|
04-Jan-2023 |
dv |
Typos in vmd error message. No functional change.
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
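A minimal sketch of the kind of check this entry describes, assuming ranges are half-open intervals. `struct mem_range` and `find_range` are hypothetical names used for illustration only; they are not the vmd(8) definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one guest memory range. */
struct mem_range {
	uint64_t gpa;	/* first guest physical address in the range */
	uint64_t size;	/* length of the range in bytes */
};

/*
 * Return the index of the range containing gpa, or -1 if none does.
 * A range covers [gpa, gpa + size), so the end address itself is
 * excluded; comparing with <= would accept one byte too many.
 */
int
find_range(const struct mem_range *r, size_t n, uint64_t gpa)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (gpa >= r[i].gpa && gpa - r[i].gpa < r[i].size)
			return ((int)i);
	}
	return (-1);
}
```

With two adjacent ranges, an inclusive upper bound would attribute the first byte of the second range to the first range, which is the wrong starting point for a search.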
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
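The convention described here can be sketched as follows. The helper `try_open` is a hypothetical name introduced for illustration; only open(2) and errno are real interfaces.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open path read-only. On success store the descriptor in *fdp and
 * return 0; on failure return the errno value. The failure test is
 * "== -1" rather than "< 0", and errno is read only on that path,
 * because it is meaningful only after a call has failed.
 */
int
try_open(const char *path, int *fdp)
{
	int fd;

	if ((fd = open(path, O_RDONLY)) == -1)
		return (errno);
	*fdp = fd;
	return (0);
}
```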
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
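A minimal sketch of tracking several states in one bitfield, as this entry describes. The flag names and helper functions below are hypothetical; the actual names used in vmd(8) may differ.

```c
#include <stdint.h>

/* Hypothetical state flags, one bit each. */
#define VM_ST_RUNNING	0x01U
#define VM_ST_PAUSED	0x02U
#define VM_ST_SHUTDOWN	0x04U

/* Set one or more state bits. */
void
state_set(uint32_t *state, uint32_t flag)
{
	*state |= flag;
}

/* Clear one or more state bits, leaving the others untouched. */
void
state_clear(uint32_t *state, uint32_t flag)
{
	*state &= ~flag;
}

/* Return nonzero if any of the given bits is set. */
int
state_has(uint32_t state, uint32_t flag)
{
	return ((state & flag) != 0);
}
```

A single word of flags can be copied, logged, or sent in a message as one value, which is harder to do consistently with several separate variables.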
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power on defaults, and bump the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
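The lookup order described above (derived image first, then the base) can be sketched with in-memory stand-ins. `struct image`, `NCLUSTERS`, and `lookup_cluster` are hypothetical and do not reflect the qcow2 on-disk format or vmd's implementation.

```c
#include <stddef.h>

#define NCLUSTERS 4

/* Hypothetical in-memory stand-in for a disk image. */
struct image {
	const char		*cluster[NCLUSTERS];	/* NULL: not allocated here */
	const struct image	*base;			/* backing image, or NULL */
};

/*
 * Look up a cluster, starting in the derived image and walking
 * toward the base until some image has the cluster allocated.
 */
const char *
lookup_cluster(const struct image *img, size_t idx)
{
	if (idx >= NCLUSTERS)
		return (NULL);
	for (; img != NULL; img = img->base) {
		if (img->cluster[idx] != NULL)
			return (img->cluster[idx]);
	}
	return (NULL);	/* unallocated in every image of the chain */
}
```

Because unallocated clusters in the derived image resolve to whatever the base currently holds, changing the base after the fact changes what the derived image reads, which matches the limitation noted in the entry.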
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re-add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
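The idea of describing guest memory with an iovec array can be sketched as below. This is not the iovec_mem() from vmd(8); `struct guest_range` and `gpa_to_iov` are hypothetical, and the sketch assumes each guest range is backed by one contiguous host buffer.

```c
#include <sys/types.h>
#include <sys/uio.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping of a guest physical range to host memory. */
struct guest_range {
	uint64_t	 gpa;	/* first guest physical address */
	size_t		 size;	/* length in bytes */
	char		*hva;	/* host buffer backing the range */
};

/*
 * Describe len bytes of guest memory starting at gpa as iovecs that
 * point straight at the backing host memory, one per range crossed.
 * Returns the number of iovecs used, or -1 if part of the span is
 * not backed or the iov array is too small.
 */
int
gpa_to_iov(const struct guest_range *r, size_t nr, uint64_t gpa, size_t len,
    struct iovec *iov, int iovcnt)
{
	size_t i, off, chunk;
	int n = 0;

	while (len > 0) {
		for (i = 0; i < nr; i++) {
			if (gpa >= r[i].gpa && gpa - r[i].gpa < r[i].size)
				break;
		}
		if (i == nr || n == iovcnt)
			return (-1);
		off = (size_t)(gpa - r[i].gpa);
		chunk = r[i].size - off;
		if (chunk > len)
			chunk = len;
		iov[n].iov_base = r[i].hva + off;
		iov[n].iov_len = chunk;
		n++;
		gpa += chunk;
		len -= chunk;
	}
	return (n);
}
```

The resulting array can be handed to readv(2) or writev(2), so data moves between a file and guest memory without an intermediate copy.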
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works), skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
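The first bug can be illustrated with a short sketch. A one-byte IN instruction only defines %al, so the emulated read should leave bits 8-31 of %eax alone. `merge_in_byte` is a hypothetical helper, not a function from vmd(8).

```c
#include <stdint.h>

/*
 * Merge a byte read from an emulated port into the guest's %eax.
 * Only the low 8 bits (%al) are replaced; the upper 24 bits keep
 * whatever the guest had there before the IN instruction.
 */
uint32_t
merge_in_byte(uint32_t eax, uint8_t data)
{
	return ((eax & 0xffffff00U) | data);
}
```

Assigning the byte directly to the full register would zero the upper bits, which is the behavior the entry describes as the first error.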
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.83 |
|
06-Feb-2023 |
dv |
vmd(8): scan pci bus to determine bootorder strings.
vmd's SeaBIOS bootorder strings had hardcoded pci device ids, so if a user added a network interface the bootorder strings didn't line up with reality. Using vmctl(8) to boot from a cdrom (-B cdrom) would fail, for instance, if attaching both a nic and a disk as well.
This change scans the pci devices and finds the first of each type to construct viable bootorder strings.
ok jan@
|
#
1.82 |
|
28-Jan-2023 |
dv |
Move some header definitions from vmm(4) to vmd(8).
Part of an ongoing effort to move userland-specific information out of a kernel header and directly into vmd(8). No functional change.
ok mlarkin@
|
#
1.81 |
|
08-Jan-2023 |
dv |
vmd(8): add thread names to vm process.
ok guenther@.
|
#
1.80 |
|
04-Jan-2023 |
dv |
Typos in vmd error message. No functional change.
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth other the years lead to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior ignored did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never mdae it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequent.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
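The convention described in this commit can be sketched as follows. This is only an illustration under stated assumptions: `ex_close_errno` is a hypothetical helper, not code taken from vmd(8).

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: close fd and report why it failed.
 * Returns 0 on success, or the errno value set by close(2).
 */
int
ex_close_errno(int fd)
{
	/* Compare against -1 exactly; "< 0" is looser than the contract. */
	if (close(fd) == -1) {
		/* errno is only meaningful on this failure path. */
		assert(errno != 0);
		return (errno);
	}
	return (0);
}
```

Calling `ex_close_errno(-1)` takes the failure path, because -1 is never a valid descriptor.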
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
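A minimal sketch of what unsetting these two bits means, assuming the architectural bit positions (CD is bit 30 of %cr0, NW is bit 29). The `EX_` names and the helper are hypothetical and are not the identifiers used in vmd(8).

```c
#include <assert.h>
#include <stdint.h>

#define EX_CR0_NW 0x20000000ULL	/* bit 29: not write-through */
#define EX_CR0_CD 0x40000000ULL	/* bit 30: cache disable */

/* Hypothetical helper: return cr0 with caching enabled. */
uint64_t
ex_enable_caching(uint64_t cr0)
{
	cr0 &= ~(EX_CR0_CD | EX_CR0_NW);
	assert((cr0 & (EX_CR0_CD | EX_CR0_NW)) == 0);
	return (cr0);
}
```

All other bits of the register value pass through unchanged, so the helper can be applied to any default register set.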
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
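The idea of one state bitfield can be sketched like this. The flag names, values, and struct are hypothetical and chosen only for illustration; they are not claimed to match the definitions in vmd(8).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state flags; each state is one bit. */
#define EX_VM_STATE_RUNNING	0x01
#define EX_VM_STATE_PAUSED	0x02
#define EX_VM_STATE_SHUTDOWN	0x04

struct ex_vm {
	uint32_t state;		/* OR of EX_VM_STATE_* flags */
};

void
ex_vm_pause(struct ex_vm *vm)
{
	assert(vm != NULL);
	vm->state |= EX_VM_STATE_PAUSED;
}

void
ex_vm_unpause(struct ex_vm *vm)
{
	assert(vm != NULL);
	vm->state &= ~(uint32_t)EX_VM_STATE_PAUSED;
}

int
ex_vm_is_paused(const struct ex_vm *vm)
{
	assert(vm != NULL);
	return ((vm->state & EX_VM_STATE_PAUSED) != 0);
}
```

A vm can then be both running and paused at once, which separate boolean variables would have to keep in sync by hand.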
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power on defaults, and bumps the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
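The lookup order described above (derived image first, then the base) can be sketched as a walk along a chain of images. This is a simplified model, not the qcow2 on-disk format: `struct ex_image`, the fixed cluster table, and `ex_lookup` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define EX_NCLUSTERS 4

/* Hypothetical in-memory model of an image and its optional base. */
struct ex_image {
	const char *clusters[EX_NCLUSTERS];	/* NULL: not stored here */
	const struct ex_image *base;		/* NULL: no base image */
};

/* Return the data for cluster idx, searching derived images first. */
const char *
ex_lookup(const struct ex_image *img, size_t idx)
{
	assert(idx < EX_NCLUSTERS);
	for (; img != NULL; img = img->base) {
		if (img->clusters[idx] != NULL)
			return (img->clusters[idx]);
	}
	return (NULL);	/* not present anywhere in the chain */
}
```

The sketch also shows why changing the base is dangerous: a derived image stores only the clusters it overrides, so everything else silently follows whatever the base contains.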
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
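As a rough sketch of the readback described above: bit 5 of the value read from port 0x61 reflects the PIT channel 2 output. The helper name and its arguments are hypothetical, and how vmd(8) tracks the fired state is not shown here.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PORT61_PIT2_OUT 0x20	/* bit 5: PIT channel 2 output status */

/*
 * Hypothetical helper: build the byte returned for a read of port 0x61.
 * latched holds the other port bits; ch2_fired is 0 or 1.
 */
uint8_t
ex_port61_read(uint8_t latched, int ch2_fired)
{
	uint8_t v;

	assert(ch2_fired == 0 || ch2_fired == 1);
	v = latched & (uint8_t)~EX_PORT61_PIT2_OUT;
	if (ch2_fired)
		v |= EX_PORT61_PIT2_OUT;
	return (v);
}
```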
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re-add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
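A sketch of the kind of translation vaddr_mem() is described as doing, including the single-range assumption. The struct layout and the function below are hypothetical stand-ins; the real vmd(8) signatures are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one guest memory range. */
struct ex_mem_range {
	uint64_t gpa;	/* guest physical start address */
	size_t size;	/* length in bytes */
	uint8_t *hva;	/* where vmd has the range mapped */
};

/*
 * Return a host pointer for [gpa, gpa + len), or NULL if the span
 * is not fully contained in a single range.
 */
void *
ex_vaddr_mem(const struct ex_mem_range *r, size_t n, uint64_t gpa, size_t len)
{
	uint64_t off;
	size_t i;

	assert(r != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (gpa < r[i].gpa)
			continue;
		off = gpa - r[i].gpa;
		if (off >= r[i].size)
			continue;
		if (len > r[i].size - off)
			return (NULL);	/* would cross the range end */
		return (r[i].hva + off);
	}
	return (NULL);
}
```

Returning NULL for a span that crosses a range boundary keeps the caller from dereferencing memory that is not contiguous on the host side.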
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work).
ok mlarkin@
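One way to express "upward only" is a subset test on feature bits: every feature the sending host exposed must also exist on the receiving host. This is an assumed formulation for illustration; the helper and its word-array layout are hypothetical and do not describe the actual check in vmd(8).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check: return 1 if the receiver has every feature bit
 * the sender has, 0 otherwise. Each array holds nwords feature words.
 */
int
ex_can_receive(const uint32_t *sender, const uint32_t *receiver, size_t nwords)
{
	size_t i;

	assert(sender != NULL && receiver != NULL);
	for (i = 0; i < nwords; i++) {
		/* Any bit set on the sender but clear here is a mismatch. */
		if ((sender[i] & ~receiver[i]) != 0)
			return (0);
	}
	return (1);
}
```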
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
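The rate arithmetic behind this can be sketched as below, assuming 8N1 framing (10 bits on the wire per character). The framing, the helper names, and the rounding are assumptions made for the example; the commit does not spell out its exact calculation.

```c
#include <assert.h>
#include <stdint.h>

#define EX_BITS_PER_CHAR 10U	/* assumed 8N1: start + 8 data + stop */

/* Characters per second sustainable at the given baud rate. */
uint32_t
ex_chars_per_sec(uint32_t baud)
{
	assert(baud > 0);
	return (baud / EX_BITS_PER_CHAR);
}

/* Whole microseconds one character occupies on the line. */
uint32_t
ex_usec_per_char(uint32_t baud)
{
	assert(baud > 0);
	return ((1000000U * EX_BITS_PER_CHAR) / baud);
}
```

Under these assumptions 9600 baud allows 960 characters per second while 115200 allows 11520, which is why the higher setting gives a more responsive console.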
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
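A /31 contains exactly two addresses that differ only in the lowest bit, so each side can derive the other. The sketch below works on IPv4 addresses in host byte order; the helper names are hypothetical, and which of the two addresses goes to the host and which to the VM is not asserted here.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if a and b fall in the same /31 (host byte order). */
int
ex_same_slash31(uint32_t a, uint32_t b)
{
	return ((a & ~1U) == (b & ~1U));
}

/* Return the other address of the /31 that addr belongs to. */
uint32_t
ex_slash31_peer(uint32_t addr)
{
	uint32_t peer = addr ^ 1U;

	assert(ex_same_slash31(addr, peer));
	return (peer);
}
```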
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
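The first bug concerns how the result of an IN instruction is merged into the guest register. A sketch of the architectural rule on x86-64 follows: 8-bit and 16-bit results replace only the low bits, while a 32-bit result zero-extends. The helper is hypothetical and is not the vmd(8) code path.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: merge size bytes of IN data into rax.
 * size is the operand width in bytes (1, 2 or 4).
 */
uint64_t
ex_merge_in_data(uint64_t rax, uint32_t data, uint8_t size)
{
	assert(size == 1 || size == 2 || size == 4);
	switch (size) {
	case 1:
		/* inb: only %al changes, the upper bits are preserved */
		return ((rax & ~0xffULL) | (data & 0xffU));
	case 2:
		/* inw: only %ax changes */
		return ((rax & ~0xffffULL) | (data & 0xffffU));
	default:
		/* inl: %eax is written and the result zero-extends */
		return ((uint64_t)data);
	}
}
```

Overwriting the whole register on a one-byte read discards state the guest still expects to find in the upper bits of %eax.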
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch; the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.82 |
|
28-Jan-2023 |
dv |
Move some header definitions from vmm(4) to vmd(8).
Part of an ongoing effort to move userland-specific information out of a kernel header and directly into vmd(8). No functional change.
ok mlarkin@
|
#
1.81 |
|
08-Jan-2023 |
dv |
vmd(8): add thread names to vm process.
ok guenther@.
|
#
1.80 |
|
04-Jan-2023 |
dv |
Typos in vmd error message. No functional change.
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
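The difference between a queue pfn and a linear address can be sketched as a shift by the page size. The sketch assumes the 4096-byte page unit used by the legacy virtio-pci queue address register; the `EX_` constant and helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EX_QUEUE_PAGE_SHIFT 12	/* assumed 4096-byte pages */

/* Convert the 32-bit page number written by the driver to a gpa. */
uint64_t
ex_queue_pfn_to_gpa(uint32_t pfn)
{
	/* Widen before shifting: the result may not fit in 32 bits. */
	return ((uint64_t)pfn << EX_QUEUE_PAGE_SHIFT);
}

/* Inverse: gpa must be page aligned and fit a 32-bit page number. */
uint32_t
ex_queue_gpa_to_pfn(uint64_t gpa)
{
	assert((gpa & ((1ULL << EX_QUEUE_PAGE_SHIFT) - 1)) == 0);
	assert((gpa >> EX_QUEUE_PAGE_SHIFT) <= UINT32_MAX);
	return ((uint32_t)(gpa >> EX_QUEUE_PAGE_SHIFT));
}
```

Treating the value as an address instead of a page number would be off by a factor of 4096 and could not describe queues placed above 4 GB.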
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
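The class of bug fixed here can be illustrated with a half-open range test. This is a generic sketch with a hypothetical helper, not the patched vmd(8) function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 if gpa lies in the half-open range [start, start + size).
 * Writing the bound as "gpa <= start + size" would accept the byte
 * one past the end, which is the off-by-one described above.
 */
int
ex_gpa_in_range(uint64_t gpa, uint64_t start, uint64_t size)
{
	assert(start + size >= start);	/* range must not wrap */
	return (gpa >= start && gpa - start < size);
}
```

For a range starting at 0x1000 with size 0x1000, the last valid address is 0x1fff and 0x2000 already belongs to whatever follows.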
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequent.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this makes it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
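The lookup order described above can be sketched as follows. This is a deliberately simplified in-memory model, not the qcow2 on-disk format and not vmd's implementation; the struct, field and function names are hypothetical, and a single byte stands in for a whole cluster.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of an image chain. Each image either holds a
 * cluster itself or defers to its base image.
 */
struct sketch_image {
	const struct sketch_image *base;	/* NULL for the base image */
	const uint8_t *allocated;		/* 1 if the cluster is present here */
	const uint8_t *data;			/* one byte stands in for a cluster */
	size_t nclusters;
};

/*
 * Read one "cluster". Returns the depth at which it was found
 * (0 = derived image, 1 = its base, ...) or -1 if no image in the
 * chain holds it, in which case *out is set to zero.
 */
int
sketch_read(const struct sketch_image *img, size_t cluster, uint8_t *out)
{
	int depth;

	assert(out != NULL);
	for (depth = 0; img != NULL; img = img->base, depth++) {
		if (cluster < img->nclusters && img->allocated[cluster]) {
			*out = img->data[cluster];
			return (depth);
		}
	}
	*out = 0;
	return (-1);
}
```

Because the derived image only stores the clusters it has overwritten, every other read falls through to the base. That is why changing the base afterwards alters what the derived image appears to contain.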
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup - vioraw: check malloc - virtio: add function to sync disks - vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
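As an illustration only, the readback could look roughly like this: bit 5 of the value returned for port 0x61 mirrors whether PIT channel 2 has fired. The function and macro names are hypothetical, and how vmd tracks the other bits of the port is not shown in this log.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 5 of port 0x61 reports the PIT channel 2 output state. */
#define SK_PORT61_TMR2_OUT	(1u << 5)

/*
 * Hypothetical helper: compose the byte a guest reads from port 0x61.
 * "latched" holds whatever the guest last wrote to the port,
 * "ch2_fired" is nonzero once channel 2 has reached its terminal count.
 */
uint8_t
sk_port61_read(uint8_t latched, int ch2_fired)
{
	uint8_t v;

	/* Bit 5 is read-only status, so never echo back a written value. */
	v = latched & (uint8_t)~SK_PORT61_TMR2_OUT;
	if (ch2_fired)
		v |= SK_PORT61_TMR2_OUT;
	return (v);
}
```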
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8) * Support various sized ISOs (Limitation of 4G ISOs on Linux guests) * Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work. * If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guests physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
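A rough sketch of the two translations, under stated assumptions: the struct below is a hypothetical stand-in for a vm memory range, and the function names and signatures are invented for illustration rather than copied from vmd. The first helper returns a direct pointer and, like vaddr_mem(), requires the whole span to sit in a single range. The second splits a span across ranges into iovec entries suitable for readv(2) or writev(2).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Hypothetical guest memory range: guest physical base, size, host mapping. */
struct sk_range {
	uint64_t gpa;
	size_t size;
	uint8_t *hva;
};

/* Return a host pointer for [gpa, gpa + len), or NULL if it spans ranges. */
void *
sk_vaddr(const struct sk_range *r, size_t n, uint64_t gpa, size_t len)
{
	size_t i, off;

	for (i = 0; i < n; i++) {
		if (gpa < r[i].gpa || gpa - r[i].gpa >= r[i].size)
			continue;
		off = (size_t)(gpa - r[i].gpa);
		if (len <= r[i].size - off)
			return (r[i].hva + off);
	}
	return (NULL);
}

/* Fill iov[] to cover [gpa, gpa + len); return entries used or -1. */
int
sk_iovec(const struct sk_range *r, size_t n, uint64_t gpa, size_t len,
    struct iovec *iov, int iovcnt)
{
	int used = 0;

	assert(iov != NULL);
	while (len > 0) {
		size_t i, off, chunk;

		for (i = 0; i < n; i++)
			if (gpa >= r[i].gpa && gpa - r[i].gpa < r[i].size)
				break;
		if (i == n || used == iovcnt)
			return (-1);	/* unmapped address or iov[] too small */

		off = (size_t)(gpa - r[i].gpa);
		chunk = r[i].size - off;
		if (chunk > len)
			chunk = len;
		iov[used].iov_base = r[i].hva + off;
		iov[used].iov_len = chunk;
		used++;
		gpa += chunk;
		len -= chunk;
	}
	return (used);
}
```

Either way the data is referenced in place, so no intermediate bounce buffer is needed.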
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
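The rate calculation can be approximated as below. This is a sketch under assumptions, not the ns8250 code: it assumes the conventional 115200 base rate divided by the programmed divisor, 10 bits on the wire per character (8N1 framing), and a tick rate of 100 per second. All names are hypothetical.

```c
#include <assert.h>

#define SK_UART_BASE_BAUD	115200u	/* assumed: standard 8250 base rate */
#define SK_BITS_PER_CHAR	10u	/* assumed: 8N1, start + 8 data + stop */
#define SK_HZ			100u	/* assumed tick rate */

/* Convert a programmed divisor latch value to a baud rate. */
unsigned int
sk_divisor_to_baud(unsigned int divisor)
{
	assert(divisor != 0);
	return (SK_UART_BASE_BAUD / divisor);
}

/*
 * Number of characters that may be sent before inserting a delay of
 * one tick, so that output roughly matches the programmed rate.
 * Always allows at least one character.
 */
unsigned int
sk_chars_per_tick(unsigned int baud)
{
	unsigned int n;

	n = (baud / SK_BITS_PER_CHAR) / SK_HZ;
	return (n > 0 ? n : 1);
}
```

Under these assumptions a guest programmed for 115200 baud may send about twelve times as many characters per tick as one programmed for 9600, which is consistent with the advice above to select the faster rate.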
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
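For illustration, the addressing in a /31 can be sketched like this: the subnet holds exactly two addresses that differ only in the lowest bit, so the peer of either one is found by flipping that bit. The function name is hypothetical, and which of the two addresses vmd assigns to the host and which to the VM is not stated here.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PREFIX31_MASK	0xfffffffeu	/* netmask of a /31 */

/*
 * Hypothetical helper: given one IPv4 address (host byte order) in a
 * /31, return the other address in the same subnet.
 */
uint32_t
sk_peer_in_slash31(uint32_t addr)
{
	uint32_t peer = addr ^ 1u;

	/* Both addresses share the same 31-bit prefix. */
	assert((peer & SK_PREFIX31_MASK) == (addr & SK_PREFIX31_MASK));
	return (peer);
}
```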
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.81 |
|
08-Jan-2023 |
dv |
vmd(8): add thread names to vm process.
ok guenther@.
|
#
1.80 |
|
04-Jan-2023 |
dv |
Typos in vmd error message. No functional change.
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
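A minimal sketch of the corrected check, assuming a hypothetical range struct rather than vmd's own: a guest physical address belongs to a range only if it falls in the half-open interval from the start up to, but not including, start plus size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a guest memory range; not vmd's struct. */
struct sk_mem_range {
	uint64_t gpa;	/* guest physical start address */
	uint64_t size;	/* length in bytes */
};

/* Return 1 if gpa lies in [r->gpa, r->gpa + r->size), else 0. */
int
sk_range_contains(const struct sk_mem_range *r, uint64_t gpa)
{
	assert(r != NULL);
	/* The byte at gpa + size is NOT part of the range. */
	return (gpa >= r->gpa && gpa - r->gpa < r->size);
}

/* Return the index of the range holding gpa, or -1 if there is none. */
int
sk_find_range(const struct sk_mem_range *ranges, size_t n, uint64_t gpa)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (sk_range_contains(&ranges[i], gpa))
			return ((int)i);
	return (-1);
}
```

With an inclusive upper bound, an address equal to the start of the second range would match the first range instead, which is the kind of misselection the commit describes.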
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a round about way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b, causing them to run very slowly
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this makes it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup - vioraw: check malloc - virtio: add function to sync disks - vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8) * Support various sized ISOs (Limitation of 4G ISOs on Linux guests) * Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work. * If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guests physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch; the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.80 |
|
04-Jan-2023 |
dv |
Typos in vmd error message. No functional change.
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
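The pfn versus address distinction can be sketched as below. This is an illustrative helper, not code from the diff: the function and macro names are hypothetical. It relies on the legacy virtio-pci convention that the queue register holds the ring's guest physical address divided by 4096.

```c
#include <stdint.h>

/* Legacy virtio-pci expresses the queue location in 4096-byte units. */
#define SK_VIRTIO_PAGE_SIZE	4096ULL

/*
 * Convert the 32-bit page frame number written by the guest driver
 * into a guest physical address. The multiplication is done in 64
 * bits, so rings placed above 4 GiB are representable.
 */
uint64_t
sk_queue_pfn_to_gpa(uint32_t pfn)
{
	return (uint64_t)pfn * SK_VIRTIO_PAGE_SIZE;
}
```

Treating the register value as a linear address would point 4096 times too low, which is presumably why the naming was tightened.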
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
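The class of bug described here can be sketched as follows. This is a minimal illustration under assumed names (`sk_mem_range`, `sk_find_range`), not the code that was committed. Each range is treated as the half-open interval [gpa, gpa + size), so the byte one past the end belongs to the following range.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a guest memory range (not vmd's struct). */
struct sk_mem_range {
	uint64_t gpa;	/* guest physical address of the first byte */
	uint64_t size;	/* length of the range in bytes */
};

/*
 * Return the index of the range containing gpa, or -1 if none does.
 * An inclusive upper bound (gpa <= start + size) would wrongly accept
 * the address one byte past the end of a range.
 */
int
sk_find_range(const struct sk_mem_range *r, size_t n, uint64_t gpa)
{
	size_t i;

	for (i = 0; i < n; i++) {
		/* Subtracting first avoids overflow in start + size. */
		if (gpa >= r[i].gpa && gpa - r[i].gpa < r[i].size)
			return (int)i;
	}
	return -1;
}
```

With two adjacent 4 KiB ranges starting at 0x0 and 0x1000, address 0x1000 resolves to the second range rather than the first.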
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (i.e. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (i.e. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
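The convention can be sketched with a small wrapper. The helper name `sk_open_readonly` is hypothetical and the example is only an illustration of the stricter comparison, not code from the commit.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open a file read-only. Returns the descriptor on success, or -1
 * with errno left as set by open(2).
 */
int
sk_open_readonly(const char *path)
{
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1)	/* compare against -1, not "fd < 0" */
		return -1;
	return fd;
}
```

Only the -1 return tells the caller that errno is meaningful; any other value is a valid descriptor.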
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this makes it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
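A single state bitfield of this kind can be sketched as below. The flag names, values, and helper functions are assumptions made for illustration and are not the identifiers used in vmd.

```c
#include <stdint.h>

/* Hypothetical flag names and values, for illustration only. */
#define SK_VM_RUNNING	0x01u
#define SK_VM_PAUSED	0x02u
#define SK_VM_SHUTDOWN	0x04u

struct sk_vm {
	uint32_t state;	/* bitwise OR of SK_VM_* flags */
};

/* Set one or more state flags. */
void
sk_vm_set(struct sk_vm *vm, uint32_t flags)
{
	vm->state |= flags;
}

/* Clear one or more state flags. */
void
sk_vm_clear(struct sk_vm *vm, uint32_t flags)
{
	vm->state &= ~flags;
}

/* Return nonzero only if every flag in "flags" is currently set. */
int
sk_vm_test(const struct sk_vm *vm, uint32_t flags)
{
	return (vm->state & flags) == flags;
}
```

Keeping everything in one word means a status report can copy or compare the whole state at once instead of consulting several separate variables.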
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
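To illustrate the pfn versus "address" distinction in this entry: a legacy virtio-pci driver writes a 32-bit page frame number, and the device side scales it by the page size to obtain the guest physical address. A minimal sketch, assuming 4096-byte pages; the names `VQ_PAGE_SHIFT` and `vq_pfn_to_gpa` are illustrative and not taken from vmd(8).

```c
#include <stdint.h>

/* Assumption: the queue pfn counts 4096-byte pages. */
#define VQ_PAGE_SHIFT	12

/*
 * Convert the 32-bit page frame number supplied by the guest driver
 * into a 64-bit guest physical address. Widen before shifting so
 * large pfn values are not truncated.
 */
static inline uint64_t
vq_pfn_to_gpa(uint32_t pfn)
{
	return ((uint64_t)pfn << VQ_PAGE_SHIFT);
}
```

Widening to 64 bits first matters because a 32-bit pfn can describe a location above 4g once it is scaled.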
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
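The off-by-one described above can be sketched with a half-open range check, where the first byte past the end is treated as outside the range. This is a hedged illustration only: `struct mem_range_sketch`, `gpa_in_range`, and `find_range` are hypothetical names, not the structures or functions used in vmd(8).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a guest memory range: [gpa, gpa + size). */
struct mem_range_sketch {
	uint64_t gpa;	/* guest physical start address */
	uint64_t size;	/* length in bytes */
};

static bool
gpa_in_range(const struct mem_range_sketch *r, uint64_t gpa)
{
	/* Subtract first so that gpa + size can never overflow. */
	return (gpa >= r->gpa && gpa - r->gpa < r->size);
}

/* Return the range containing gpa, or NULL if none does. */
static const struct mem_range_sketch *
find_range(const struct mem_range_sketch *ranges, size_t n, uint64_t gpa)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (gpa_in_range(&ranges[i], gpa))
			return (&ranges[i]);
	return (NULL);
}
```

With two adjacent ranges, an inclusive upper bound would match the first range for the address that actually starts the second one; the strict comparison avoids that.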
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (i.e. the event handler to receive messages from the vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (i.e. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequent.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
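The convention described in this entry can be sketched as follows: compare the return value against -1 exactly, and read errno only on that path. This is an illustrative helper, not code from vmd(8); the name `open_readonly` is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open path read-only. On success store the descriptor in *fdp and
 * return 0; on failure return the errno value set by open(2).
 */
static int
open_readonly(const char *path, int *fdp)
{
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1)		/* compare against -1, not "< 0" */
		return (errno);	/* errno is only meaningful on this path */
	*fdp = fd;
	return (0);
}
```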
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this makes it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
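A single-bitfield state of the kind described above might look like the sketch below. The flag names, their values, and `struct vm_sketch` are assumptions made for illustration; they are not the definitions from vmd.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state flags, one bit each. */
#define SK_STATE_RUNNING	0x01
#define SK_STATE_PAUSED		0x02
#define SK_STATE_SHUTDOWN	0x04

struct vm_sketch {
	uint32_t state;	/* OR of SK_STATE_* flags */
};

static void
state_set(struct vm_sketch *vm, uint32_t flag)
{
	vm->state |= flag;
}

static void
state_clear(struct vm_sketch *vm, uint32_t flag)
{
	vm->state &= ~flag;
}

static bool
state_test(const struct vm_sketch *vm, uint32_t flag)
{
	return ((vm->state & flag) != 0);
}
```

Because every state lives in one word, a vm can be both running and paused at once, and reporting the full state is a single read.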
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initialize the %drX registers to power-on defaults, and bump the VM send/receive header to reflect the same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re-add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
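The idea behind vaddr_mem(), handing out a host pointer for a guest physical address on the assumption that the whole access fits inside one memory range, can be sketched as below. This is not the vmd(8) implementation; `struct host_range_sketch` and `gpa_to_host` are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping of one guest memory range into the host process. */
struct host_range_sketch {
	uint64_t gpa;	/* guest physical start address */
	size_t size;	/* length of the range in bytes */
	uint8_t *hva;	/* host address where the range is mapped */
};

/*
 * Return a host pointer for [gpa, gpa + len) if the entire span lies
 * within a single range, otherwise NULL. Spans that would cross from
 * one range into another are rejected rather than split.
 */
static void *
gpa_to_host(const struct host_range_sketch *ranges, size_t n,
    uint64_t gpa, size_t len)
{
	size_t i;

	for (i = 0; i < n; i++) {
		const struct host_range_sketch *r = &ranges[i];
		uint64_t off;

		if (gpa < r->gpa)
			continue;
		off = gpa - r->gpa;
		if (off < r->size && len <= r->size - off)
			return (r->hva + off);
	}
	return (NULL);
}
```

Returning NULL for a span that straddles two ranges keeps callers from dereferencing memory that is contiguous in the guest but not in the host mapping.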
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
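The rate calculation described in this entry can be approximated as follows. This is a sketch under stated assumptions, not the ns8250 code from vmd(8): it assumes the conventional 115200 base rate divided by the programmed divisor, 10 bits on the wire per character (8N1 framing), and a caller-supplied tick rate; all names are hypothetical.

```c
#include <stdint.h>

#define SK_UART_BASE_BAUD	115200	/* conventional PC UART base rate */
#define SK_BITS_PER_CHAR	10	/* assumption: start + 8 data + stop */

/* Baud rate selected by a divisor latch value; 0 is treated as invalid. */
static uint32_t
baud_from_divisor(uint16_t divisor)
{
	if (divisor == 0)
		return (0);
	return (SK_UART_BASE_BAUD / divisor);
}

/*
 * Approximate number of characters that may be sent per timer tick at
 * the given baud rate before a delay should be inserted. Always allow
 * at least one character so slow rates still make progress.
 */
static uint32_t
chars_per_tick(uint32_t baud, uint32_t hz)
{
	uint32_t n;

	if (hz == 0)
		return (0);
	n = baud / SK_BITS_PER_CHAR / hz;
	return (n > 0 ? n : 1);
}
```

Under these assumptions, with 100 ticks per second a guest programmed for 9600 baud gets roughly 9 characters per tick and one at 115200 gets roughly 115, which matches the advice above to prefer the higher rate.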
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.79 |
|
28-Dec-2022 |
jmc |
spelling fixes; from paul tagliamonte any parts of his diff not taken are noted on tech
|
#
1.78 |
|
26-Dec-2022 |
dv |
vmd(8): provide a detailed e820 memory map.
When booting guests with SeaBIOS, vmd(8) supplied details about the available guest memory via CMOS registers. Consequently, we've been carrying some patches in the ports tree to SeaBIOS to fetch this information like it's the 1990s.
When a vm initializes memory ranges, we now track what each range represents. This information can be used to supply the e820 memory map to SeaBIOS via the fw_cfg interface allowing it to properly communicate memory ranges to a guest operating system. (This will also allow us to drop some patches from the port.)
Given the ranges can now be marked with a purpose, this also allows vmm(4) to switch from hard-coded mmio ranges and instead let the information on the memory range dictate if vmm should be handling a page fault or sending to vmd for a memory assist.
Tested by Mischa Peters and others. OK mlarkin@.
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (i.e. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (i.e. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequent.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
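The convention in this commit message can be sketched in C as follows. This is a minimal illustration, not code from the commit; try_open() is a hypothetical helper introduced only for the example.

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical helper: open a file read-only and report failure the
 * way the commit message describes. A failed system call returns
 * exactly -1, and errno is only meaningful in that case.
 */
int
try_open(const char *path, int *saved_errno)
{
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1) {		/* compare against -1, not "< 0" */
		*saved_errno = errno;
		return (-1);
	}
	*saved_errno = 0;
	return (fd);
}
```

The caller reads the saved errno only when the return value is -1.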
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
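A minimal sketch of tracking state in a single bitfield, as described in this entry. The EX_* flag names, struct ex_vm, and the helpers are hypothetical and chosen only for illustration; the real definitions live in vmd's own headers.

```c
#include <stdint.h>

/* Hypothetical flag names, one bit per state. */
#define EX_STATE_RUNNING	0x01
#define EX_STATE_PAUSED		0x02
#define EX_STATE_SHUTDOWN	0x04

struct ex_vm {
	uint32_t state;		/* one bitfield instead of several ints */
};

/* Set one or more state bits. */
void
ex_set(struct ex_vm *vm, uint32_t flags)
{
	vm->state |= flags;
}

/* Clear one or more state bits. */
void
ex_clear(struct ex_vm *vm, uint32_t flags)
{
	vm->state &= ~flags;
}

/* Return nonzero if any of the given bits is set. */
int
ex_has(const struct ex_vm *vm, uint32_t flags)
{
	return ((vm->state & flags) != 0);
}
```

With this layout a single integer can be reported or compared, rather than a handful of separate variables.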
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
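As a rough sketch of the readback described in this entry: bit 5 of the byte read from port 0x61 reflects the channel 2 output. The helper name and its parameters are hypothetical; how vmd tracks the "fired" state is not shown here.

```c
#include <stdint.h>

#define EX_PORT61_TMR2_OUT	(1u << 5)	/* bit 5: channel 2 output */

/*
 * Hypothetical helper: build the byte returned for a read of port
 * 0x61. Bit 5 is set when PIT channel 2 has fired and cleared while
 * it is still counting. All other bits pass through unchanged.
 */
uint8_t
ex_port61_read(uint8_t other_bits, int ch2_fired)
{
	uint8_t v;

	v = other_bits & (uint8_t)~EX_PORT61_TMR2_OUT;
	if (ch2_fired)
		v |= EX_PORT61_TMR2_OUT;
	return (v);
}
```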
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guests physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
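The idea behind vaddr_mem() can be sketched as below. This is not the committed implementation: struct ex_mem_range and ex_vaddr_mem() are hypothetical names, and the sketch encodes the same assumption the message states, namely that a requested span must sit inside a single memory range.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one guest memory range. */
struct ex_mem_range {
	uint64_t gpa;	/* guest physical start address */
	size_t	 size;	/* length in bytes */
	uint8_t	*hva;	/* where the range is mapped in this process */
};

/*
 * Return a host pointer for [gpa, gpa + len), or NULL if the span is
 * not fully contained in one of the n ranges.
 */
void *
ex_vaddr_mem(const struct ex_mem_range *r, size_t n, uint64_t gpa,
    size_t len)
{
	uint64_t off;
	size_t i;

	for (i = 0; i < n; i++) {
		if (gpa < r[i].gpa)
			continue;
		off = gpa - r[i].gpa;
		/* written to avoid overflow in off + len */
		if (off < r[i].size && len <= r[i].size - off)
			return (r[i].hva + off);
	}
	return (NULL);
}
```

A caller can then reference guest structures in place through the returned pointer instead of copying them into a bounce buffer.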
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
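The rate calculation can be approximated as follows. This is only a sketch under stated assumptions: it assumes 10 bits on the wire per character (8N1 framing), and the function names and the burst parameter are hypothetical, not taken from the ns8250 module.

```c
#include <stdint.h>

#define EX_BITS_PER_CHAR	10	/* assumed 8N1: start + 8 data + stop */

/* Approximate characters per second sustainable at a given baud rate. */
uint32_t
ex_chars_per_sec(uint32_t baud)
{
	return (baud / EX_BITS_PER_CHAR);
}

/*
 * Microseconds to pause after transmitting `burst` characters so the
 * average output rate does not exceed the programmed baud rate.
 */
uint64_t
ex_pause_usec(uint32_t baud, uint32_t burst)
{
	uint32_t cps;

	cps = ex_chars_per_sec(baud);
	if (cps == 0)
		return (0);
	return (((uint64_t)burst * 1000000ULL) / cps);
}
```

Under these assumptions 9600 baud sustains about 960 characters per second and 115200 baud about 11520, which is why the message suggests the higher setting for a more responsive console.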
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
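For illustration, the two addresses of a /31 differ only in their lowest bit, so either one can be derived from the other. The helpers below are hypothetical and work on IPv4 addresses in host byte order; which of the pair goes to the host and which to the VM is not something this sketch decides.

```c
#include <stdint.h>

/* Return the other address in the same /31 (host byte order). */
uint32_t
ex_peer31(uint32_t addr)
{
	return (addr ^ 1u);
}

/* Return the base (even) address of the /31 containing addr. */
uint32_t
ex_net31(uint32_t addr)
{
	return (addr & 0xfffffffeu);
}
```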
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
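The first bug concerns how the result of a narrow port read is written back to the guest register. One way to express the intended behaviour is sketched below; ex_merge_in() is a hypothetical helper, not the function vmd uses.

```c
#include <stdint.h>

/*
 * Merge the result of an IN of `size` bytes (1, 2 or 4) into the
 * previous %eax value. Only the low `size` bytes are replaced; the
 * remaining high bits are preserved instead of being zeroed.
 */
uint32_t
ex_merge_in(uint32_t eax, uint32_t data, unsigned int size)
{
	uint32_t mask;

	switch (size) {
	case 1:
		mask = 0xffu;
		break;
	case 2:
		mask = 0xffffu;
		break;
	default:
		mask = 0xffffffffu;
		break;
	}
	return ((eax & ~mask) | (data & mask));
}
```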
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.77 |
|
23-Dec-2022 |
dv |
vmd(8): implement zero-copy operations on virtqueues.
The original virtio device implementation relied on allocating a buffer on heap, copying the virtqueue from the guest, mutating the copy, and then overwriting the virtqueue in the guest.
While the approach worked, it was both complex and added extra overhead. On older hardware, switching to the zero-copy approach can show a noticeable performance improvement for vionet devices. An added benefit is this diff also reduces the amount of code in vmd, which is always a welcome change.
In addition, change to talking about the queue pfn and not "address" as the virtio-pci spec has drivers provide a 32-bit value representing the physical page number of the location in guest memory, not the linear address.
Original idea from dlg@ while working on re-adding async task queues.
ok dlg@, tested by many
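To make the pfn versus address distinction concrete: the 32-bit value the driver provides is a page number, so it has to be scaled by the page size to get a guest physical address. The sketch assumes 4 KiB pages; the macro and function names are hypothetical.

```c
#include <stdint.h>

#define EX_VIRTIO_PAGE_SHIFT	12	/* assumed 4096-byte pages */

/* Convert a driver-supplied queue pfn to a guest physical address. */
uint64_t
ex_queue_pfn_to_gpa(uint32_t pfn)
{
	/* widen first so large pfns do not overflow 32 bits */
	return ((uint64_t)pfn << EX_VIRTIO_PAGE_SHIFT);
}
```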
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
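The class of bug described here can be illustrated with a half-open range check. This is a sketch only; ex_gpa_in_range() is a hypothetical name and the actual vmd code is not reproduced.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return nonzero if gpa lies in the half-open interval
 * [start, start + size). The address one byte past the end
 * (start + size) is not part of the range.
 */
int
ex_gpa_in_range(uint64_t gpa, uint64_t start, size_t size)
{
	return (gpa >= start && gpa - start < size);
}
```

Using "<=" in place of "<" in the second comparison would accept the first byte past the end, which is the off-by-one the entry describes.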
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
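The under-read problem mentioned above comes from read(2) being allowed to return fewer bytes than requested. A minimal loop that keeps reading until the request is satisfied might look like this. It is a sketch in the spirit of atomicio, not its actual source; ex_read_full() is a hypothetical name.

```c
#include <sys/types.h>

#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Read exactly n bytes unless EOF or an error occurs first.
 * Returns the number of bytes read (less than n only on EOF),
 * or -1 on error with errno set by read(2).
 */
ssize_t
ex_read_full(int fd, void *buf, size_t n)
{
	char *p = buf;
	size_t done = 0;
	ssize_t r;

	while (done < n) {
		r = read(fd, p + done, n - done);
		if (r == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return (-1);
		}
		if (r == 0)
			break;			/* EOF */
		done += (size_t)r;
	}
	return ((ssize_t)done);
}
```

The write side needs the same treatment, looping on write(2) until all bytes are accepted.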
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (i.e. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (i.e. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequent.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect the same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs makes sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.76 |
|
11-Nov-2022 |
dv |
Revert removal of toggling interrupt line in vmd vcpu run loop.
phessler reports a performance regression. Needs more testing.
|
#
1.75 |
|
10-Nov-2022 |
dv |
vmd(8): remove toggling interrupt line on vcpu in vcpu run loop
We toggle the interrupt "line" on the vcpu when we assert or deassert irq on the pic in either the vcpu thread (emulating some devices) or on the device event thread (mostly handling reading available data). Having it in the vcpu run loop here just results in another ioctl(2) call before the one for re-entering the guest cpu.
Removing it shows no noticeable behavioral change in existing guests.
ok mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty infrequent.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect the same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.74 |
|
10-Nov-2022 |
dv |
vmd(8): import mmio decode and emulation, disabled for now.
The initial mmio support for vmd adds support for only specific MOV and MOVZX instructions. Plan is to begin iterating in-tree on other missing pieces. All functionality is gated behind an #if for now.
Only change to vmm(4) is reordering register #define's in vmmvar.h.
ok mlarkin@
|
Revision tags: OPENBSD_7_2_BASE
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
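The bug described above is the usual closed-versus-half-open interval mistake. A minimal sketch of the two checks, assuming a hypothetical `struct mem_range` (the struct and function names are illustrative and are not vmd's actual code):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical guest memory range; not vmd's actual structure. */
struct mem_range {
	uint64_t base;	/* first guest physical address in the range */
	uint64_t size;	/* length of the range in bytes */
};

/* Buggy form: also accepts gpa == base + size, one byte past the end. */
bool
range_contains_buggy(const struct mem_range *r, uint64_t gpa)
{
	return (gpa >= r->base && gpa <= r->base + r->size);
}

/* Fixed form: the range is half-open, [base, base + size). */
bool
range_contains(const struct mem_range *r, uint64_t gpa)
{
	return (gpa >= r->base && gpa - r->base < r->size);
}
```

Writing the upper bound as `gpa - r->base < r->size` also avoids overflow when `base + size` would wrap.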
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequent.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
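As an illustration of the convention described above, here is a minimal sketch that compares against -1 rather than testing for `< 0`. The helper name `try_open` is hypothetical and does not appear in vmd:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open a file read-only and close it again.
 * Returns 0 on success, or the errno value reported by open(2) on failure.
 */
int
try_open(const char *path)
{
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1)		/* the documented error value, not "fd < 0" */
		return (errno);	/* errno is only meaningful in this branch */
	close(fd);
	return (0);
}
```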
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
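A minimal sketch of tracking state in a single bitfield. The flag names, the `struct vm_sketch` type, and the helpers are hypothetical and chosen for illustration; they are not necessarily the identifiers used in vmd:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state flags, one bit each. */
#define VM_ST_RUNNING	0x01u
#define VM_ST_PAUSED	0x02u
#define VM_ST_SHUTDOWN	0x04u

/* Hypothetical stand-in for the per-vm structure. */
struct vm_sketch {
	uint32_t state;	/* OR of VM_ST_* flags */
};

void
vm_state_set(struct vm_sketch *vm, uint32_t flag)
{
	vm->state |= flag;
}

void
vm_state_clear(struct vm_sketch *vm, uint32_t flag)
{
	vm->state &= ~flag;
}

bool
vm_state_has(const struct vm_sketch *vm, uint32_t flag)
{
	return ((vm->state & flag) != 0);
}
```

With one field, a vm that is both running and paused is simply `VM_ST_RUNNING | VM_ST_PAUSED`, and reporting code can test any combination with a single mask.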
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power on defaults, and bump the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
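The lookup order described above (derived image first, then the base) can be sketched as a walk along a chain of images. This is a simplified model with hypothetical names (`struct image`, `lookup_cluster`) and a fixed cluster table; it does not reflect the real qcow2 on-disk layout or vmd's implementation:

```c
#include <stddef.h>

#define NCLUSTERS	4

/* Hypothetical in-memory image: a cluster table plus an optional base. */
struct image {
	const struct image *base;		/* NULL for a base image */
	const char	   *cluster[NCLUSTERS];	/* NULL if not allocated here */
};

/*
 * Return the data for cluster idx, searching the derived image first and
 * falling back to each base image in turn. Returns NULL when no image in
 * the chain has allocated the cluster.
 */
const char *
lookup_cluster(const struct image *img, size_t idx)
{
	if (idx >= NCLUSTERS)
		return (NULL);
	for (; img != NULL; img = img->base)
		if (img->cluster[idx] != NULL)
			return (img->cluster[idx]);
	return (NULL);
}
```

The sketch also shows why modifying the base is unsafe: the derived image only stores the clusters it has overwritten, so every other read silently picks up whatever the base now contains.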
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
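A minimal sketch of how a read of port 0x61 could report the channel 2 output state in bit 5. The function and macro names are hypothetical, and the handling of the remaining bits is simplified; this is not vmd's actual handler:

```c
#include <stdbool.h>
#include <stdint.h>

#define PORT61_PIT2_OUT	(1u << 5)	/* bit 5: PIT channel 2 output state */

/*
 * Compose the byte returned for a read of port 0x61.
 * other_bits supplies the remaining bits; bit 5 is always overwritten
 * with the channel 2 state (set when the counter has fired).
 */
uint8_t
port61_read(uint8_t other_bits, bool ch2_fired)
{
	uint8_t v;

	v = other_bits & (uint8_t)~PORT61_PIT2_OUT;
	if (ch2_fired)
		v |= PORT61_PIT2_OUT;
	return (v);
}
```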
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
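The idea behind populating an iovec array from guest physical addresses can be sketched as follows. The `struct guest_range` type, the function name `gpa_to_iov`, and its signature are assumptions made for illustration; the real iovec_mem() in vmd may differ in all of them:

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Hypothetical description of one guest memory range mapped into vmd. */
struct guest_range {
	uint64_t gpa;	/* guest physical base address */
	size_t	 size;	/* length in bytes */
	void	*hva;	/* where the range is mapped in this process */
};

/*
 * Fill iov[] with host pointers covering [gpa, gpa + len).
 * Returns the number of entries used, or -1 if part of the span is
 * unmapped or iov[] is too small.
 */
int
gpa_to_iov(const struct guest_range *r, size_t nr, uint64_t gpa, size_t len,
    struct iovec *iov, int iovcnt)
{
	size_t i, off, chunk;
	int n = 0;

	while (len > 0) {
		/* Find the range holding the current address. */
		for (i = 0; i < nr; i++)
			if (gpa >= r[i].gpa && gpa - r[i].gpa < r[i].size)
				break;
		if (i == nr || n == iovcnt)
			return (-1);

		/* Take as much as this range can supply. */
		off = (size_t)(gpa - r[i].gpa);
		chunk = r[i].size - off;
		if (chunk > len)
			chunk = len;

		iov[n].iov_base = (char *)r[i].hva + off;
		iov[n].iov_len = chunk;
		n++;
		gpa += chunk;
		len -= chunk;
	}
	return (n);
}
```

The resulting array can be handed straight to readv(2) or writev(2), so disk data moves to or from guest memory without an intermediate bounce buffer.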
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
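A rough sketch of the rate calculation described above, assuming 8N1 framing (10 bit times per character) and a pause of one tick (1/hz seconds) between bursts. The function name and the exact formula are illustrative assumptions and are not taken from the ns8250 code:

```c
#include <stdint.h>

/* One character on an 8N1 line costs 10 bit times: start + 8 data + stop. */
#define BITS_PER_CHAR	10u

/*
 * Number of characters that may be sent between pauses of 1/hz seconds
 * so that output approximates the programmed baud rate.
 * Always returns at least 1 so that output never stalls completely.
 */
uint32_t
chars_per_tick(uint32_t baud, uint32_t hz)
{
	uint32_t n;

	if (hz == 0)
		return (1);
	n = baud / (BITS_PER_CHAR * hz);
	return (n > 0 ? n : 1);
}
```

Under these assumptions and a 100 Hz tick, 115200 baud allows 115 characters per tick while 9600 baud allows only 9, which is why the commit suggests raising the guest's baud rate.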
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.73 |
|
01-Sep-2022 |
dv |
vmm(4): send all port io emulation to userland
Simplify things by sending any io exits from IN/OUT instructions to userland instead of trying to emulate anything in the kernel. vmm was sending most pertinent exits to vmd anyways, so this functionally changes little.
An added benefit is this solves an issue reported by tb@ where i386 OpenBSD guests would probe for a pc keyboard repeatedly and cause excessive vm exits. (The emulation in vmm was not properly handling these port reads.)
While here, make the assignment of the VEI_DIR_{IN,OUT} enum values not assume the underlying integer the compiler may assign.
ok mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (i.e. the event handler to receive messages from the vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (i.e. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequent.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
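The convention described above can be illustrated with a minimal sketch. The helper name open_readonly() is hypothetical and is not taken from vmd(8); it only shows the comparison against -1 that this commit asks callers to use.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from vmd): open a path read-only and detect
 * failure by comparing against -1 rather than testing "< 0".
 * Returns the descriptor, or -1 with errno left as set by open(2).
 */
int
open_readonly(const char *path)
{
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1)		/* strict form; the looser form is "fd < 0" */
		return (-1);	/* errno is only meaningful on this path */
	return (fd);
}
```

A caller would inspect errno only after seeing the -1 return, since a successful call does not reset it.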
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
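As a rough illustration of the single-bitfield approach, the sketch below packs several states into one integer. All names and flag values here (EX_VM_STATE_*, struct ex_vm, ex_vm_set and friends) are assumptions made for the example and are not the identifiers used in vmd.h.

```c
#include <stdint.h>

/* Illustrative flag values; not the names or values used by vmd. */
#define EX_VM_STATE_RUNNING	0x01
#define EX_VM_STATE_PAUSED	0x02
#define EX_VM_STATE_SHUTDOWN	0x04

struct ex_vm {
	uint32_t state;		/* one word replaces several int flags */
};

/* Set one or more state bits. */
void
ex_vm_set(struct ex_vm *vm, uint32_t flag)
{
	vm->state |= flag;
}

/* Clear one or more state bits. */
void
ex_vm_clear(struct ex_vm *vm, uint32_t flag)
{
	vm->state &= ~flag;
}

/* Return non-zero if any of the given bits are set. */
int
ex_vm_is(const struct ex_vm *vm, uint32_t flag)
{
	return ((vm->state & flag) != 0);
}
```

With this layout a vm can be both running and paused at once, and reporting code can test each bit independently.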
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power on defaults, and bump the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup - vioraw: check malloc - virtio: add function to sync disks - vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8) * Support various sized ISOs (Limitation of 4G ISOs on Linux guests) * Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work. * If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re-add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guests physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
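The arithmetic behind the note above can be sketched as follows. This is not the ns8250 code from vmd; the function names are made up for the example, and the 10-bits-per-character figure assumes 8N1 framing (one start bit, eight data bits, one stop bit).

```c
#include <stdint.h>

/* Assumed framing: 1 start + 8 data + 1 stop bit per character (8N1). */
#define EX_BITS_PER_CHAR	10

/* Approximate number of characters a line at this baud rate can carry. */
uint32_t
ex_chars_per_sec(uint32_t baud)
{
	return (baud / EX_BITS_PER_CHAR);
}

/* Approximate time on the wire for one character, in microseconds. */
uint32_t
ex_usec_per_char(uint32_t baud)
{
	if (baud == 0)
		return (0);	/* avoid dividing by zero */
	return ((EX_BITS_PER_CHAR * 1000000U) / baud);
}
```

Under these assumptions 9600 baud works out to roughly 960 characters per second and 115200 baud to roughly 11520, which is why the higher setting gives a noticeably faster console.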
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.72 |
|
30-Aug-2022 |
dv |
Initial support for mmio assist for vmm(4)
Provide the basic information required for a userland assist in emulating instructions touching mmio regions, sending as much information as is provided by the host hardware.
No decode or assist provided at the moment by vmd(8).
ok mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
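The kind of off-by-one described above can be shown with a small half-open range check. The struct and function names are hypothetical and the code is a sketch of the general technique, not the actual vmd(8) fix.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a guest memory range (not vmd's struct). */
struct ex_mem_range {
	uint64_t gpa;	/* first guest physical address in the range */
	size_t	 size;	/* length of the range in bytes */
};

/*
 * Return non-zero if gpa lies inside the half-open interval
 * [r->gpa, r->gpa + r->size). An inclusive upper bound such as
 * "gpa <= r->gpa + r->size" would accept one byte past the end.
 */
int
ex_gpa_in_range(const struct ex_mem_range *r, uint64_t gpa)
{
	return (gpa >= r->gpa && gpa - r->gpa < r->size);
}
```

Writing the upper bound as a subtraction compared against the size also avoids overflow when the range ends near the top of the address space.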
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allow it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (i.e. the event handler to receive messages from the vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (i.e. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequent.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power on defaults, and bump the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup - vioraw: check malloc - virtio: add function to sync disks - vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8) * Support various sized ISOs (Limitation of 4G ISOs on Linux guests) * Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work. * If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re-add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guests physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
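The translation that vaddr_mem() is described as doing can be sketched as follows. This is an illustration only: `struct sk_mem_range` and `sk_guest_ptr` are hypothetical names, not vmd's types or signatures. It encodes the assumption stated above, that a contiguous guest physical span must sit inside a single memory range.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one guest memory range mapped into the host. */
struct sk_mem_range {
	uint64_t	 gpa;	/* guest physical start address */
	size_t		 size;	/* length of the range in bytes */
	uint8_t		*hva;	/* host mapping of the range */
};

/*
 * Return a host pointer for [gpa, gpa + len), or NULL if the span is not
 * fully contained in a single range.
 */
static void *
sk_guest_ptr(const struct sk_mem_range *r, size_t nranges, uint64_t gpa,
    size_t len)
{
	uint64_t off;
	size_t i;

	for (i = 0; i < nranges; i++) {
		if (gpa < r[i].gpa)
			continue;
		off = gpa - r[i].gpa;
		/* Compare offsets rather than end addresses to avoid overflow. */
		if (off < r[i].size && len <= r[i].size - off)
			return r[i].hva + off;
	}
	return NULL;
}
```

A caller that gets NULL back would have to fall back to copying through a bounce buffer, which is exactly what the direct pointer is meant to avoid.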
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
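The "upward only" rule amounts to a subset test on feature bits. The sketch below assumes the features are reduced to a single 32-bit mask, which is a simplification; the real check presumably compares several CPUID values, and `sk_features_compatible` is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: a VM may be received only if every feature bit the
 * sending host exposed is also present on the receiving host.
 */
static bool
sk_features_compatible(uint32_t sender, uint32_t receiver)
{
	return (sender & ~receiver) == 0;
}
```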
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
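To give a feel for the numbers involved, here is a small sketch of the pacing arithmetic. It is not the ns8250 code: the tick rate of 100 Hz, the 10 bits per character (8N1 framing), and the name `sk_chars_per_tick` are all assumptions made for illustration.

```c
#include <stdint.h>

#define SK_HZ			100	/* assumed tick rate */
#define SK_BITS_PER_CHAR	10	/* assumed 8N1: start + 8 data + stop */

/*
 * Hypothetical helper: how many characters may be sent before pausing for
 * one tick so that output stays near the programmed baud rate.
 */
static uint32_t
sk_chars_per_tick(uint32_t baud)
{
	uint32_t n = baud / SK_BITS_PER_CHAR / SK_HZ;

	/* Always allow at least one character so output makes progress. */
	return n > 0 ? n : 1;
}
```

Under these assumptions 9600 baud allows 9 characters per tick and 115200 baud allows 115, which is why the higher rate gives a noticeably snappier console.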
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
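A /31 holds exactly two addresses, so each local interface needs one pair. The sketch below shows one way such pairs could be carved out of a base prefix; the indexing scheme, the host-even/VM-odd split, and the names are assumptions for illustration, not vmd's actual allocation logic. Addresses are in host byte order.

```c
#include <stdint.h>

/* Hypothetical result type: the two addresses of one /31. */
struct sk_local_pair {
	uint32_t	host;	/* address configured on the tap(4) interface */
	uint32_t	vm;	/* address handed to the VM via DHCP (BOOTP) */
};

/*
 * Hypothetical helper: pick the index-th /31 above base. base must be
 * aligned to an even address so each pair shares all but the last bit.
 */
static struct sk_local_pair
sk_local_pair_for(uint32_t base, uint32_t index)
{
	struct sk_local_pair p;

	p.host = base + 2 * index;	/* even address of the pair */
	p.vm = p.host + 1;		/* odd address of the pair */
	return p;
}
```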
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
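The first bug is easy to picture: a byte-wide `in` only writes %al, so the emulation has to merge the data into the low 8 bits and keep the rest of %eax. A minimal sketch (the function name is hypothetical, not taken from vmd):

```c
#include <stdint.h>

/*
 * Hypothetical helper: store the result of an 8-bit port read into the
 * guest's %eax while preserving bits 31:8.
 */
static uint32_t
sk_merge_inb(uint32_t eax, uint8_t data)
{
	return (eax & 0xffffff00u) | data;
}
```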
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs makes sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch; the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.71 |
|
29-Jun-2022 |
dv |
vmd(8): fix off by one in vm memory range check
When inspecting if a gpa falls into a known memory range, vmd was considering it valid 1 byte past the end resulting in selecting the wrong starting range for the search.
ok mlarkin@
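A minimal sketch of the corrected check, assuming a range is described by a start address and a size (`sk_range_contains` is a hypothetical name, not the function touched by this commit). A guest physical address is valid only inside the half-open interval [start, start + size):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if gpa lies inside [start, start + size).
 * Comparing the offset against size excludes the byte one past the end
 * and avoids overflow when computing start + size.
 */
static bool
sk_range_contains(uint64_t start, uint64_t size, uint64_t gpa)
{
	return gpa >= start && gpa - start < size;
}
```

The off-by-one form would use `<=` in the second comparison, which accepts `start + size` and so matches the wrong range when two ranges are adjacent.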
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
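The under-read problem mentioned above comes from read(2) being allowed to return fewer bytes than requested. The sketch below shows the general loop-until-done pattern; it is not the atomicio implementation, and `sk_read_full` is a hypothetical name.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: read exactly len bytes unless EOF or an error
 * intervenes. Returns the number of bytes read, or -1 on error.
 */
static ssize_t
sk_read_full(int fd, void *buf, size_t len)
{
	char *p = buf;
	size_t done = 0;
	ssize_t n;

	while (done < len) {
		n = read(fd, p + done, len - done);
		if (n == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		if (n == 0)
			break;			/* EOF before len bytes */
		done += (size_t)n;
	}
	return (ssize_t)done;
}
```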
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
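In code, the convention looks like the sketch below: compare against -1 exactly, and read errno only on that path. The wrapper `sk_try_open` is a hypothetical example, not something from the tree.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: open path read-only. Returns 0 and stores the
 * descriptor in *fdp on success, or the errno value on failure.
 */
static int
sk_try_open(const char *path, int *fdp)
{
	int fd;

	/* Test for -1 exactly, not "< 0": only -1 signals an error. */
	if ((fd = open(path, O_RDONLY)) == -1)
		return errno;	/* errno is meaningful only here */
	*fdp = fd;
	return 0;
}
```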
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
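For orientation, the first step of any such walker is splitting the virtual address into table indices. The sketch below assumes 4-level long-mode paging with 4 KiB pages, where each level consumes 9 bits; the walker added in this commit may handle other modes and page sizes, and the names here are hypothetical.

```c
#include <stdint.h>

/* Hypothetical container for the pieces of a 48-bit virtual address. */
struct sk_va_parts {
	unsigned int	pml4;	/* bits 47:39 */
	unsigned int	pdpt;	/* bits 38:30 */
	unsigned int	pd;	/* bits 29:21 */
	unsigned int	pt;	/* bits 20:12 */
	unsigned int	offset;	/* bits 11:0 */
};

/* Split va into the four table indices and the page offset. */
static struct sk_va_parts
sk_split_va(uint64_t va)
{
	struct sk_va_parts p;

	p.pml4 = (unsigned int)((va >> 39) & 0x1ff);
	p.pdpt = (unsigned int)((va >> 30) & 0x1ff);
	p.pd = (unsigned int)((va >> 21) & 0x1ff);
	p.pt = (unsigned int)((va >> 12) & 0x1ff);
	p.offset = (unsigned int)(va & 0xfff);
	return p;
}
```

A full walk would then use each index to read an entry from guest memory, check its present bit, and follow the physical address it holds to the next level.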
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this makes it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
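A small sketch of the idea, with made-up flag names and values (the real definitions live in vmd's headers and may differ): each state is one bit, so several can be tested in a single expression.

```c
#include <stdbool.h>

/* Hypothetical state bits; names and values are for illustration only. */
#define SK_VM_RUNNING	0x01
#define SK_VM_PAUSED	0x02
#define SK_VM_SHUTDOWN	0x04

/*
 * Hypothetical predicate: a vm can be paused only if it is running and
 * is neither already paused nor shutting down.
 */
static bool
sk_can_pause(unsigned int state)
{
	return (state & SK_VM_RUNNING) != 0 &&
	    (state & (SK_VM_PAUSED | SK_VM_SHUTDOWN)) == 0;
}
```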
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power on defaults, and bump the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
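The lookup order described above can be sketched with a toy model. Nothing here is the qcow2 driver: the image is reduced to a few one-byte "clusters" and a presence map, and `struct sk_image` and `sk_read_cluster` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_NCLUSTERS	4	/* toy image size */

/* Hypothetical toy image: one byte of data per cluster. */
struct sk_image {
	const struct sk_image	*base;			/* NULL if no base */
	uint8_t			 present[SK_NCLUSTERS];	/* 1 if allocated */
	uint8_t			 data[SK_NCLUSTERS];
};

/*
 * Read cluster idx, starting in the derived image and falling back
 * through the chain of base images. Clusters allocated nowhere read
 * as zero. Returns -1 if idx is out of range.
 */
static int
sk_read_cluster(const struct sk_image *img, size_t idx, uint8_t *out)
{
	if (idx >= SK_NCLUSTERS)
		return -1;
	for (; img != NULL; img = img->base) {
		if (img->present[idx]) {
			*out = img->data[idx];
			return 0;
		}
	}
	*out = 0;
	return 0;
}
```

The model also shows why modifying the base corrupts derived images: any cluster the derived image has not overwritten is still read from the base.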
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.70 |
|
26-Jun-2022 |
dv |
vmd: create a copy of bios at 4g boundary
Newer Linux kernels call into the bios to perform a reboot and our version of SeaBIOS assumes there's a "copy" of the bios ending at 4g. When SeaBIOS reads from this area, since vmd doesn't perform mmio yet, guests terminate with an unhandled fault.
Carve out some space ending at 4g and copy the bios there. Technically we could load garbage there, but give SeaBIOS what it wants for now.
ok mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vm's with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth other the years lead to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior ignored did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections from anyone. "Removing it makes sense" reyk (who wrote the FFS module). OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (i.e. the event handler to receive messages from the vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (i.e. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct), causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequent.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
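As a small illustration of the convention this entry describes, the sketch below compares the return values of open(2) and close(2) against -1 exactly and reads errno only in that case. The helper name `probe_path()` is hypothetical and is not taken from vmd(8).

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open a path read-only and close it again.
 * Returns 0 on success, or the errno value reported by the failing call.
 */
static int
probe_path(const char *path)
{
	int fd;

	/* Test for -1 exactly; errno is only meaningful in that case. */
	if ((fd = open(path, O_RDONLY)) == -1)
		return errno;
	if (close(fd) == -1)
		return errno;
	return 0;
}
```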
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b, causing them to run very slowly.
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
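The set-then-verify pattern described here might look roughly like the sketch below. The struct layout, the field sizes and the `SKETCH_SIGNATURE` value are assumptions made for illustration; the real vm_dump_header layout and the value of VMM_HV_SIGNATURE are not shown in this log.

```c
#include <string.h>

#define SKETCH_SIGNATURE "SKETCH_HV"	/* stand-in for VMM_HV_SIGNATURE */

/* Hypothetical dump header; the real layout is not given in this log. */
struct sketch_dump_header {
	char		sig[16];	/* fixed-size signature field */
	unsigned int	version;
};

/* Fill in the header when dumping, including the signature. */
static void
sketch_header_init(struct sketch_dump_header *h, unsigned int version)
{
	memset(h, 0, sizeof(*h));
	memcpy(h->sig, SKETCH_SIGNATURE, sizeof(SKETCH_SIGNATURE));
	h->version = version;
}

/* On restore: return 0 if the signature matches, -1 otherwise. */
static int
sketch_header_check(const struct sketch_dump_header *h)
{
	if (memcmp(h->sig, SKETCH_SIGNATURE, sizeof(SKETCH_SIGNATURE)) != 0)
		return -1;
	return 0;
}
```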
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
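A single state bitfield of the kind this entry describes could be sketched as below. The flag names, values and helpers are hypothetical and chosen only for illustration; they are not vmd(8)'s actual definitions.

```c
/* Hypothetical state flags; vmd(8)'s real names and values may differ. */
#define SKETCH_STATE_RUNNING	0x01
#define SKETCH_STATE_PAUSED	0x02
#define SKETCH_STATE_SHUTDOWN	0x04

struct sketch_vm {
	unsigned int	state;	/* bitwise OR of SKETCH_STATE_* flags */
};

static void
sketch_state_set(struct sketch_vm *vm, unsigned int flag)
{
	vm->state |= flag;
}

static void
sketch_state_clear(struct sketch_vm *vm, unsigned int flag)
{
	vm->state &= ~flag;
}

/* Returns nonzero if every bit in flag is set. */
static int
sketch_state_isset(const struct sketch_vm *vm, unsigned int flag)
{
	return (vm->state & flag) == flag;
}
```

With one field, a vm that is both running and paused is simply `SKETCH_STATE_RUNNING | SKETCH_STATE_PAUSED`, and a status report only needs to test bits.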
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re-add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
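The address translation that vaddr_mem() is described as performing can be sketched as below. This is a guess at the general shape, not the committed code: `struct sketch_mem_range` and `sketch_vaddr_mem()` are hypothetical, and the sketch encodes the stated assumption that a requested span must sit entirely inside one memory range.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one guest memory range mapped into vmd. */
struct sketch_mem_range {
	uint64_t	 gpa;	/* guest physical base address */
	size_t		 size;	/* length of the range in bytes */
	uint8_t		*hva;	/* where the range is mapped in this process */
};

/*
 * Translate [gpa, gpa + len) into a pointer usable by the host process.
 * Returns NULL if the span is unmapped or would cross a range boundary.
 */
static void *
sketch_vaddr_mem(const struct sketch_mem_range *r, size_t nranges,
    uint64_t gpa, size_t len)
{
	size_t i;
	uint64_t off;

	for (i = 0; i < nranges; i++) {
		if (gpa < r[i].gpa)
			continue;
		off = gpa - r[i].gpa;
		if (off >= r[i].size)
			continue;
		/* The whole span must fit inside this single range. */
		if (len > r[i].size - off)
			return NULL;
		return r[i].hva + off;
	}
	return NULL;
}
```

An iovec_mem()-style helper would apply the same lookup per range and fill `struct iovec` entries instead of returning a single pointer.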
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.69 |
|
03-May-2022 |
dv |
vmm/vmd/vmctl: standardize memory units to bytes
At different points in the vm lifecycle vmm(4), vmctl(8), and vmd(8) refer to a vm's memory range sizes in either bytes or megabytes. This is needlessly complex.
Switch to using bytes everywhere and adjust types and constants accordingly. While this makes it possible to specify vms with memory in fractions of megabytes, the logic requiring whole megabyte values remains.
Feedback from deraadt@, mlarkin@, and Matthew Martin.
ok mlarkin@
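To make the units change concrete, the sketch below keeps sizes in bytes while still enforcing a whole-megabyte rule, as the entry says the remaining logic does. The names `SKETCH_MIB`, `sketch_mib_to_bytes()` and `sketch_is_whole_mib()` are hypothetical, and treating "megabyte" as 1024 * 1024 bytes is an assumption of this sketch.

```c
#include <stdint.h>

#define SKETCH_MIB	(1024ULL * 1024ULL)	/* assumed bytes per megabyte */

/* Convert a legacy megabyte count into bytes. */
static uint64_t
sketch_mib_to_bytes(uint64_t mib)
{
	return mib * SKETCH_MIB;
}

/* Returns nonzero if a byte count is a nonzero whole number of megabytes. */
static int
sketch_is_whole_mib(uint64_t bytes)
{
	return bytes != 0 && bytes % SKETCH_MIB == 0;
}
```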
|
Revision tags: OPENBSD_7_1_BASE
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
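The under-read problem mentioned in this entry comes from read(2) being allowed to return fewer bytes than requested. The sketch below shows the general retry loop; it is not the atomicio code itself, and `sketch_read_full()` is a hypothetical helper name.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Read until len bytes have arrived, EOF is hit, or an error occurs.
 * Returns the number of bytes read (possibly short at EOF) or -1 on error.
 */
static ssize_t
sketch_read_full(int fd, void *buf, size_t len)
{
	char *p = buf;
	size_t done = 0;
	ssize_t n;

	while (done < len) {
		n = read(fd, p + done, len - done);
		if (n == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		if (n == 0)
			break;			/* EOF */
		done += (size_t)n;
	}
	return (ssize_t)done;
}
```

A matching write loop follows the same pattern with write(2), advancing the buffer by whatever each call reports as written.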
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth other the years lead to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior ignored did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never mdae it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happend ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get thsoe patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads, event_thread, that runs all the callbacks and the vcpu thread when running exit handlers. This could have lead to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a round about way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when hanlding the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intented to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vpcus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to sets the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end continually de-queueing and re-queueing the event continually. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will makes it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power on defaults, and bump the VM send/recieve header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting a OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This works is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds ot the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating disk derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup - vioraw: check malloc - virtio: add function to sync disks - vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmm(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8) * Support various sized ISOs (Limitation of 4G ISOs on Linux guests) * Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work. * If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd-usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
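The translation described for vaddr_mem() can be sketched roughly as follows. This is a minimal illustration, not vmd's code: `struct mem_range` and `gpa_to_host()` are hypothetical names, and the sketch simply refuses any span that is not fully contained in one range, which mirrors the single-range assumption stated above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one guest memory range (not vmd's struct). */
struct mem_range {
	uint64_t gpa;	/* guest physical start address */
	size_t	 size;	/* length of the range in bytes */
	uint8_t	*hva;	/* where the range is mapped in this process */
};

/*
 * Return a host pointer for the guest span [gpa, gpa + len), or NULL if
 * the span is not fully contained in a single range.
 */
void *
gpa_to_host(const struct mem_range *r, size_t nranges, uint64_t gpa,
    size_t len)
{
	size_t i;

	for (i = 0; i < nranges; i++) {
		uint64_t off;

		if (gpa < r[i].gpa)
			continue;
		off = gpa - r[i].gpa;
		/* Compare against the remaining bytes to avoid overflow. */
		if (off < r[i].size && len <= r[i].size - off)
			return (r[i].hva + off);
	}
	return (NULL);
}
```

A caller could use the returned pointer to read a virtio ring in place, or store it in an iovec for readv/writev, instead of copying through a bounce buffer.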
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
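The pacing arithmetic behind this change can be sketched as below. This is only an assumed model, not the ns8250 code itself: it assumes 10 bits on the wire per character (8N1 framing) and a tick rate of 100 per second, and `chars_per_tick()` is a hypothetical helper name.

```c
#include <stdint.h>

#define SKETCH_HZ		100	/* assumed ticks per second */
#define SKETCH_BITS_PER_CHAR	10	/* start bit + 8 data bits + stop bit */

/*
 * Number of characters that may be transmitted before a one-tick delay
 * is inserted, so that output roughly matches the programmed baud rate.
 */
uint32_t
chars_per_tick(uint32_t baud)
{
	uint32_t n = baud / SKETCH_BITS_PER_CHAR / SKETCH_HZ;

	/* Always allow at least one character per tick. */
	return (n == 0 ? 1 : n);
}
```

Under these assumptions a guest programmed for 9600 baud gets 9 characters per tick while 115200 baud gets 115, which is why the higher setting gives a more responsive console.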
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. The use of timeouts and VM ACKs makes sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch; the implementation is independent of the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.68 |
|
01-Mar-2022 |
dv |
vmd(8): gracefully handle hitting data limits when starting a vm
With recent changes to login.conf(5) to restrict daemon datasize to a finite value, users can now hit resource limits when attempting to start a vm.
This change fixes the error path when hitting the limit. vmd(8) will no longer abort and memory error messages are relayed to the user.
While here, address potential under-reads/writes using atomicio when relaying data between the child vm process and vmd's vmm process.
Original diff from tedu@. OK mlarkin@.
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (i.e. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (i.e. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
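The convention described in this commit can be illustrated with a short sketch. `open_and_close()` is a hypothetical helper written for this example, not a function from vmd; it only shows the `== -1` comparison and reading errno strictly on the failure path.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Return 0 on success, or the errno value set by the failing call. */
int
open_and_close(const char *path)
{
	int fd;

	/* Compare against -1 exactly, rather than testing "< 0". */
	if ((fd = open(path, O_RDONLY)) == -1)
		return (errno);	/* errno is only meaningful after a failure */
	if (close(fd) == -1)
		return (errno);
	return (0);
}
```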
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
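A single-bitfield state of this kind might look like the sketch below. The flag names and the `vm_is()` helper are hypothetical and chosen for illustration; they are not the constants used in vmd.

```c
#include <stdint.h>

/* Hypothetical state flags; each state occupies its own bit. */
#define SKETCH_VM_RUNNING	0x01
#define SKETCH_VM_PAUSED	0x02
#define SKETCH_VM_SHUTDOWN	0x04

/* Return non-zero if every bit in "flag" is set in "state". */
int
vm_is(uint32_t state, uint32_t flag)
{
	return ((state & flag) == flag);
}
```

Setting a state is then `state |= SKETCH_VM_PAUSED` and clearing it is `state &= ~SKETCH_VM_PAUSED`, so one variable can be reported or tested in a single place.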
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
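The lookup order described above (derived image first, then the base image) can be sketched as follows. This is a simplified in-memory model, not the qcow2 code in vmd: `struct sketch_image` and `find_cluster()` are hypothetical, and real images resolve clusters through on-disk tables rather than pointer arrays.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory image; a NULL entry means "not allocated here". */
struct sketch_image {
	const uint8_t		  **clusters;
	size_t			    nclusters;
	const struct sketch_image  *base;	/* NULL if no base image */
};

/*
 * Look for a cluster in the derived image first, then walk down the
 * chain of base images. Returns NULL if no image in the chain has it.
 */
const uint8_t *
find_cluster(const struct sketch_image *img, size_t idx)
{
	for (; img != NULL; img = img->base) {
		if (idx < img->nclusters && img->clusters[idx] != NULL)
			return (img->clusters[idx]);
	}
	return (NULL);
}
```

The sketch also shows why modifying the base is dangerous: any cluster the derived image has not overridden is read straight from the base, so changing it there silently changes what the derived image sees.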
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.67 |
|
30-Dec-2021 |
claudio |
Add back support for -B net -b bsd.rd which emulates a PXE install and results in an autoinstall. This can be used to quickly create new OpenBSD installs. OK dv@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth other the years lead to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior ignored did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never mdae it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happend ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get thsoe patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads, event_thread, that runs all the callbacks and the vcpu thread when running exit handlers. This could have lead to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a round about way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when hanlding the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intented to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
The previous implementation sometimes hit a deadlock, as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
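The convention described above can be sketched as follows. This is a minimal illustration, not code from the tree: `read_some` is a hypothetical wrapper, and only read(2) and errno are real interfaces.

```c
#include <sys/types.h>

#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to len bytes from fd.
 * Returns the byte count, or -1 with errno set by read(2).
 */
ssize_t
read_some(int fd, void *buf, size_t len)
{
	ssize_t n;

	assert(buf != NULL);

	/* Compare against -1 exactly rather than testing "< 0". */
	n = read(fd, buf, len);
	if (n == -1)
		return (-1);	/* errno is only meaningful on this path */
	return (n);
}
```

The caller should inspect errno only after seeing the -1 return.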
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b, causing them to run very slowly.
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
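A single-bitfield state of the kind described above might look like the sketch below. The flag names, the struct, and the helpers are all hypothetical and chosen for illustration; the constants actually used by vmd(8) may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag names; not taken from the vmd(8) sources. */
#define VM_ST_RUNNING	0x01u
#define VM_ST_PAUSED	0x02u
#define VM_ST_SHUTDOWN	0x04u

struct vm_sketch {
	uint32_t state;		/* one word replaces several booleans */
};

static inline void
vm_set(struct vm_sketch *vm, uint32_t flags)
{
	assert(vm != NULL);
	vm->state |= flags;
}

static inline void
vm_clear(struct vm_sketch *vm, uint32_t flags)
{
	assert(vm != NULL);
	vm->state &= ~flags;
}

/* True only if every bit in flags is set. */
static inline bool
vm_has(const struct vm_sketch *vm, uint32_t flags)
{
	assert(vm != NULL);
	return ((vm->state & flags) == flags);
}
```

Reporting then reduces to testing bits in one field, for example `vm_has(&vm, VM_ST_RUNNING | VM_ST_PAUSED)`.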
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
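The lookup order described above (derived image first, then the base) can be sketched with an in-memory stand-in. Everything here is an assumption made for illustration: `image_sketch`, `lookup_cluster`, and the fixed cluster count are hypothetical and do not reflect the on-disk qcow2 layout or vmd's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NCLUSTERS 4

/* Hypothetical image: a NULL slot means "not allocated in this image". */
struct image_sketch {
	const uint8_t *cluster[SKETCH_NCLUSTERS];
	const struct image_sketch *base;	/* NULL for a base image */
};

/*
 * Walk from the derived image down through its base images and return
 * the first allocated copy of cluster idx, or NULL if none has one.
 */
const uint8_t *
lookup_cluster(const struct image_sketch *img, size_t idx)
{
	assert(idx < SKETCH_NCLUSTERS);

	for (; img != NULL; img = img->base)
		if (img->cluster[idx] != NULL)
			return (img->cluster[idx]);
	return (NULL);
}
```

Because a derived image stores only the clusters it has written, changing the base underneath it changes what unwritten clusters resolve to, which is the corruption hazard the commit message mentions.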
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup - vioraw: check malloc - virtio: add function to sync disks - vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8) * Support various sized ISOs (Limitation of 4G ISOs on Linux guests) * Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work. * If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
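The address translation described above can be sketched as a search over memory ranges. This is an illustration under stated assumptions: `mem_range_sketch` and `gpa_to_hva` are hypothetical names, not the signatures of iovec_mem() or vaddr_mem(), and the sketch enforces the same single-range assumption the commit message calls out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one guest memory range. */
struct mem_range_sketch {
	uint64_t gpa;	/* guest physical base address */
	size_t	 size;	/* length in bytes */
	uint8_t	*hva;	/* where the range is mapped in this process */
};

/*
 * Return a pointer usable by this process for the guest physical span
 * [gpa, gpa + len), or NULL if the span does not fit in a single range.
 */
void *
gpa_to_hva(const struct mem_range_sketch *r, size_t nranges, uint64_t gpa,
    size_t len)
{
	uint64_t off;
	size_t i;

	assert(r != NULL || nranges == 0);

	for (i = 0; i < nranges; i++) {
		if (gpa < r[i].gpa)
			continue;
		off = gpa - r[i].gpa;
		/* Written to avoid overflow in off + len. */
		if (off >= r[i].size || len > r[i].size - off)
			continue;
		return (r[i].hva + off);
	}
	return (NULL);
}
```

An iovec-based variant would fill one iov_base/iov_len pair per translated span instead of returning a single pointer.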
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
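The rate limit described above comes down to simple arithmetic, sketched here. The tick rate of 100 and the 10 bits per character (8N1 framing) are assumptions made for this example, and `chars_per_tick` is a hypothetical name; the ns8250 module may compute its limit differently.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_HZ		100	/* assumed ticks per second */
#define SKETCH_BITS_PER_CHAR	10	/* assumed 8N1: start + 8 data + stop */

/*
 * Number of characters that may be transmitted per tick at the given
 * baud rate before a one-tick delay is inserted. Never returns 0, so
 * slow rates still make progress.
 */
uint32_t
chars_per_tick(uint32_t baud)
{
	uint32_t n;

	assert(baud > 0);

	n = baud / SKETCH_BITS_PER_CHAR / SKETCH_HZ;
	return (n == 0 ? 1 : n);
}
```

Under these assumptions 9600 baud allows 9 characters per tick and 115200 baud allows 115, which is why the higher setting gives a noticeably faster console.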
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
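A /31 subnet holds exactly two addresses that differ only in the lowest bit, so one end can be derived from the other. The helper below is a hypothetical sketch of that arithmetic on a host-byte-order IPv4 address; it does not show which of the two addresses vmd assigns to the host or to the VM.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Given one IPv4 address of a /31 pair (host byte order), return the
 * other address in the same /31.
 */
uint32_t
slash31_peer(uint32_t addr)
{
	uint32_t peer = addr ^ 1u;

	/* Both addresses share the upper 31 bits. */
	assert((peer >> 1) == (addr >> 1));
	return (peer);
}
```

For example, the peer of 0x64400002 is 0x64400003 and vice versa.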
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. The use of timeouts and VM ACKs makes sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.66 |
|
29-Nov-2021 |
deraadt |
mostly avoid sys/param.h with a local nitems() ok mlarkin
|
Revision tags: OPENBSD_7_0_BASE
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works; the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (i.e. the event handler to receive messages from the vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (i.e. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
The previous implementation sometimes hit a deadlock, as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b, causing them to run very slowly.
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup - vioraw: check malloc - virtio: add function to sync disks - vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8) * Support various sized ISOs (Limitation of 4G ISOs on Linux guests) * Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work. * If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.65 |
|
01-Sep-2021 |
dv |
remove unused functions and cleanup vmd.h
Discussed with mlarkin@. These functions were implemented but never used. While in vmd.h, fix the order to match current vmd(8) reality.
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (i.e. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (i.e. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
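A minimal sketch of the convention this commit describes, not code taken from the tree: the failure test compares against -1 exactly, and errno is read only on that path. The helper name `ex_open_ro` is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 0 and stores the descriptor in *fdp on success,
 * or returns the errno value set by open(2) on failure.
 */
int
ex_open_ro(const char *path, int *fdp)
{
	int fd;

	assert(path != NULL && fdp != NULL);

	fd = open(path, O_RDONLY);
	if (fd == -1)		/* exactly -1, not "fd < 0" */
		return (errno);	/* errno is only meaningful on this path */

	*fdp = fd;
	return (0);
}
```

The stricter comparison documents the actual contract of the call, so a reader does not have to wonder which other negative values might be possible.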
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
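As a rough illustration of the single-bitfield approach, assuming hypothetical flag names and values (the `EX_STATE_*` macros and `struct ex_vm` below are invented for this sketch and are not the names used in vmd):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag names and values; the tree's own may differ. */
#define EX_STATE_RUNNING	0x01u
#define EX_STATE_PAUSED		0x02u
#define EX_STATE_SHUTDOWN	0x04u

struct ex_vm {
	uint32_t	state;	/* OR of EX_STATE_* flags */
};

/* Turn on every flag in f. */
static inline void
ex_state_set(struct ex_vm *vm, uint32_t f)
{
	assert(vm != NULL);
	vm->state |= f;
}

/* Turn off every flag in f. */
static inline void
ex_state_clear(struct ex_vm *vm, uint32_t f)
{
	assert(vm != NULL);
	vm->state &= ~f;
}

/* Nonzero if any flag in f is set. */
static inline int
ex_state_isset(const struct ex_vm *vm, uint32_t f)
{
	assert(vm != NULL);
	return ((vm->state & f) != 0);
}
```

One word of state can then be copied, logged, or compared in a single operation, which is the reporting benefit the commit message points at.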
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power on defaults, and bump the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
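The lookup order described above (derived image first, then its base) can be sketched roughly as follows. This is a toy in-memory model, not the qcow2 on-disk layout: `struct ex_image`, `ex_read_cluster`, and the tiny cluster size are all hypothetical, and returning zeros for a cluster that no image in the chain holds is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_CLUSTER_SIZE	4	/* tiny on purpose, for the example */

/* Hypothetical stand-in for a disk image. */
struct ex_image {
	const struct ex_image	*base;		/* NULL for a base image */
	const uint8_t		*allocated;	/* allocated[i] != 0: cluster i stored here */
	const uint8_t		*data;		/* nclusters * EX_CLUSTER_SIZE bytes */
	size_t			 nclusters;
};

/*
 * Read cluster idx, starting at img and falling back along the
 * chain of base images. Returns 0 on success, -1 if idx is out of
 * range for the image being read.
 */
int
ex_read_cluster(const struct ex_image *img, size_t idx, uint8_t *buf)
{
	assert(img != NULL && buf != NULL);

	if (idx >= img->nclusters)
		return (-1);

	for (; img != NULL; img = img->base) {
		if (idx < img->nclusters && img->allocated[idx]) {
			memcpy(buf, img->data + idx * EX_CLUSTER_SIZE,
			    EX_CLUSTER_SIZE);
			return (0);
		}
	}

	/* Assumption: data held by no image in the chain reads as zeros. */
	memset(buf, 0, EX_CLUSTER_SIZE);
	return (0);
}
```

The sketch also shows why modifying the base is hazardous: the derived image stores only the clusters it has overwritten, so every other read silently reflects whatever the base currently contains.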
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guests physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
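A rough sketch of the translation idea behind vaddr_mem(), under the same assumption the commit states (a contiguous guest physical range lives in one memory range). The struct, field names, and the `ex_vaddr_mem` signature are hypothetical and do not reproduce the function in the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one guest memory range. */
struct ex_mem_range {
	uint64_t	 gpa;	/* guest physical start address */
	size_t		 size;	/* length of the range in bytes */
	uint8_t		*hva;	/* where the range is mapped in this process */
};

/*
 * Return a pointer usable by this process for guest physical address
 * gpa, or NULL if [gpa, gpa + len) is not fully inside one range.
 */
void *
ex_vaddr_mem(const struct ex_mem_range *r, size_t nranges, uint64_t gpa,
    size_t len)
{
	uint64_t off;
	size_t i;

	assert(r != NULL);

	for (i = 0; i < nranges; i++) {
		if (gpa < r[i].gpa)
			continue;
		off = gpa - r[i].gpa;
		if (off >= r[i].size)
			continue;
		if (len > r[i].size - off)
			return (NULL);	/* would run past the range end */
		return (r[i].hva + off);
	}
	return (NULL);
}
```

An iovec-building variant would apply the same per-range arithmetic and emit one `struct iovec` entry per resolved segment, which is what lets readv(2) and writev(2) move data without a bounce buffer.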
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
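To give a feel for the arithmetic involved, here is a small sketch of how long a burst of characters takes on a serial line at a given baud rate. It is not the calculation from the ns8250 module: the function name is hypothetical, and the figure of 10 bits per character (one start bit, eight data bits, one stop bit) is an assumption of this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 1 start + 8 data + 1 stop bit per character. */
#define EX_BITS_PER_CHAR	10

/*
 * Microseconds needed to transmit nchars characters at the given
 * baud rate, rounded down. Returns 0 for a baud rate of 0.
 */
uint64_t
ex_tx_time_usec(uint32_t baud, uint32_t nchars)
{
	uint64_t bits;

	if (baud == 0)
		return (0);

	bits = (uint64_t)nchars * EX_BITS_PER_CHAR;
	assert(bits <= UINT64_MAX / 1000000);	/* cannot overflow below */
	return (bits * 1000000 / baud);
}
```

Under these assumptions a single character takes about 1 ms at 9600 baud, which is why the commit suggests moving guests to 115200 once output is paced to the programmed rate.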
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.64 |
|
16-Jul-2021 |
dv |
vmd(8): simplify vcpu logic, removing uart & vionet reads
Remove legacy state handling on the ns8250 and virtio network devices originally put in place before using libevent for async device events. The vcpu thread doesn't need to process device data as it is handled by the libevent thread.
This has the benefit of simplifying some of the message passing between threads introduced to the ns8250 uart since both the vcpu and libevent threads were processing read events.
No functional change intended. Tested by many, including abieber@, weerd@, Mischa Peters, and Matthias Schmidt. (Thanks.)
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (i.e. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (i.e. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power on defaults, and bump the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This works is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds ot the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
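The lookup order described above (derived image first, then the base image) can be sketched as follows. This is a toy model with dict-backed clusters and hypothetical names. It is not the qcow2 on-disk format or vmd's implementation.

```python
from typing import Dict, Optional


class Image:
    """Toy disk image: a map of allocated clusters plus an optional base."""

    def __init__(self, clusters: Dict[int, bytes],
                 base: Optional["Image"] = None):
        self.clusters = clusters
        self.base = base

    def read_cluster(self, index: int, size: int = 4) -> bytes:
        # Start in the derived image and fall through to each base in turn.
        img: Optional[Image] = self
        while img is not None:
            if index in img.clusters:
                return img.clusters[index]
            img = img.base
        # Unallocated in every image: read as zeros (assumed behaviour).
        return bytes(size)

    def write_cluster(self, index: int, data: bytes) -> None:
        # Writes only ever land in the derived image.
        self.clusters[index] = data
```

The sketch also shows why modifying the base is dangerous: any cluster the derived image has not overwritten is still read from the base.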
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
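A minimal sketch of the readback, assuming only what the message states (channel 2 status is reported in bit 5 of port 0x61). The function name and the handling of the other bits are hypothetical.

```python
PIT_CH2_STATUS_BIT = 1 << 5  # bit 5 of port 0x61, per the commit message


def read_port_61(ch2_fired: bool, other_bits: int = 0) -> int:
    """Build the byte a guest would see when reading port 0x61."""
    value = other_bits & 0xFF & ~PIT_CH2_STATUS_BIT
    if ch2_fired:
        value |= PIT_CH2_STATUS_BIT
    return value
```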
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
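The idea behind iovec_mem() can be sketched as a translation from a guest physical span into per-range (host address, length) pairs. The tuple layout and the Python signature below are assumptions for illustration. The real function is C and fills a struct iovec array.

```python
from typing import List, Tuple

# (guest physical start, length, host base address): hypothetical layout
MemRange = Tuple[int, int, int]


def iovec_mem(ranges: List[MemRange], gpa: int,
              length: int) -> List[Tuple[int, int]]:
    """Split [gpa, gpa + length) into (host address, length) entries."""
    iov = []
    while length > 0:
        for start, size, host in ranges:
            if start <= gpa < start + size:
                # Take as much as this memory range can supply.
                chunk = min(length, start + size - gpa)
                iov.append((host + (gpa - start), chunk))
                gpa += chunk
                length -= chunk
                break
        else:
            raise ValueError("guest address not backed by memory")
    return iov
```

A span that crosses a range boundary yields two entries, which is what lets readv/writev move the data without a bounce buffer.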
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
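One way to express the "upward only" rule is a subset test on CPU feature masks. The sketch below assumes features are compared as plain bitmasks. How vmd actually gathers and compares them is not shown in this log.

```python
def can_receive(sender_features: int, receiver_features: int) -> bool:
    """Accept a VM only if the receiver has every feature the sender had."""
    return (sender_features & ~receiver_features) == 0
```

With a toy mask where the newer CPU has one extra bit, `can_receive(0b011, 0b111)` is true and `can_receive(0b111, 0b011)` is false, matching the broadwell/skylake results above.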
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
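A back-of-the-envelope version of the rate calculation, under stated assumptions: 10 bits on the wire per character (8N1 framing) and a 100 Hz tick. Both values and the function name are illustrative, not taken from the ns8250 code.

```python
def chars_per_tick(baud: int, hz: int = 100, bits_per_char: int = 10) -> int:
    """Characters that fit into one 1/hz-second tick at the given baud rate."""
    return max(1, baud // (bits_per_char * hz))
```

Under these assumptions 9600 baud allows about 9 characters per tick and 115200 baud about 115, which is why the higher setting gives a noticeably faster console.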
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
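The /31 split can be illustrated with Python's ipaddress module. The example prefix and the choice of which address goes to the host and which to the VM are assumptions made for this sketch.

```python
import ipaddress


def local_pair(prefix: str):
    """Return (host address, VM address) for a /31 point-to-point subnet."""
    net = ipaddress.ip_network(prefix, strict=True)
    if net.prefixlen != 31:
        raise ValueError("expected a /31 prefix")
    # A /31 holds exactly two addresses; assume the host takes the first.
    host_addr, vm_addr = list(net)
    return host_addr, vm_addr
```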
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
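Conceptually, a locked lladdr is a filter on the source MAC of every frame the guest transmits. The sketch below relies only on the standard Ethernet header layout. The function name is hypothetical.

```python
def frame_allowed(frame: bytes, vm_mac: bytes) -> bool:
    """Accept a guest frame only if its source MAC is the configured one."""
    # Ethernet header: 6 bytes destination, 6 bytes source, 2 bytes type.
    return len(frame) >= 14 and frame[6:12] == vm_mac
```

Because the check is made on the frame itself, it does not depend on which switch or bridge the interface is attached to.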
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.63 |
|
16-Jun-2021 |
dv |
cleanup vmd(8) includes and header files
Lots of organic growth over the years led to unnecessary includes (proc.h everywhere) and odd dependencies between header files. This cleans things up a bit to help with upcoming cleanup around dhcp code.
No functional change.
"go for it" mlarkin@
|
Revision tags: OPENBSD_6_9_BASE
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
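Transparent handling of compressed images can be sketched as a check for the gzip magic bytes before loading. This Python version uses the standard gzip module as a stand-in for libz. The function name is hypothetical.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def load_kernel(data: bytes) -> bytes:
    """Return the kernel image, decompressing it first if it is gzip'd."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data
```

Callers pass either kind of image and get the uncompressed bytes back, so the rest of the loader does not need to know the difference.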
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works; the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
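The barrier approach can be sketched with Python threads: the pausing thread and every vcpu thread wait on the same barrier, so the pause call cannot return before all vcpus have checked in, and there is no wakeup to miss. Names are hypothetical and the vcpu work is reduced to recording an index.

```python
import threading


def pause_all(n_vcpus: int) -> list:
    """Block until every vcpu thread has reported that it is paused."""
    # One slot per vcpu plus one for the thread requesting the pause.
    barrier = threading.Barrier(n_vcpus + 1)
    paused = []
    lock = threading.Lock()

    def vcpu(index: int) -> None:
        with lock:
            paused.append(index)   # the vcpu has reached its paused state
        barrier.wait()             # check in; released once all arrive

    threads = [threading.Thread(target=vcpu, args=(i,))
               for i in range(n_vcpus)]
    for t in threads:
        t.start()
    barrier.wait()                 # returns only after every vcpu arrived
    for t in threads:
        t.join()
    return sorted(paused)
```

Unlike a condition variable signalled before the waiter is ready, the barrier counts arrivals, so the ordering of the two sides does not matter.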
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
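The first step of any such walker is splitting a virtual address into table indices. The sketch below assumes 4-level long-mode paging with 4 KiB pages (9 index bits per level, 12 offset bits). It does not model the actual table reads or other paging modes that the real walker may handle.

```python
def split_va(va: int) -> dict:
    """Split a 48-bit virtual address into 4-level paging indices."""
    return {
        "pml4": (va >> 39) & 0x1FF,
        "pdpt": (va >> 30) & 0x1FF,
        "pd": (va >> 21) & 0x1FF,
        "pt": (va >> 12) & 0x1FF,
        "offset": va & 0xFFF,
    }
```

A full walk would use each index to select an entry in the corresponding table, starting from the table that %cr3 points to, and check the present bit at every level.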
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
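Writing a signature on dump and checking it on restore can be sketched as follows. The signature bytes and the header layout are placeholders. The real value of VMM_HV_SIGNATURE and the real vm_dump_header fields are not shown in this log.

```python
import struct

SIGNATURE = b"HYPOTHETICAL-SIG"   # placeholder, not the real signature value
HEADER = struct.Struct("<16sI")   # signature, version: hypothetical layout


def dump_header(version: int) -> bytes:
    """Serialize a header with the signature actually filled in."""
    return HEADER.pack(SIGNATURE, version)


def check_header(blob: bytes) -> int:
    """Validate the signature before trusting the rest of the image."""
    if len(blob) < HEADER.size:
        raise ValueError("truncated header")
    sig, version = HEADER.unpack_from(blob)
    if sig != SIGNATURE:
        raise ValueError("bad signature")
    return version
```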
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect the same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is currently used only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.62 |
|
05-Apr-2021 |
dv |
Support booting from compressed kernel images.
The bsd.rd ramdisk now ships gzip'd on amd64. Use libz in base to transparently handle decompression of any compressed kernel images.
Patch from Josh Rickmar.
ok kn@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works; the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
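The convention described here can be sketched as follows. The helper `probe_path` is hypothetical and only illustrates comparing against -1 exactly before consulting errno.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a path read-only and close it again.
 * Returns 0 on success, or the errno value of the failing call.
 */
int
probe_path(const char *path)
{
	int fd;

	/* Compare against -1 exactly; errno is only meaningful then. */
	if ((fd = open(path, O_RDONLY)) == -1)
		return errno;
	if (close(fd) == -1)
		return errno;
	return 0;
}
```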
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b, causing them to run very slowly
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
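A rough sketch of the kind of translation described above, under the stated assumption that a contiguous guest-physical span sits in a single memory range. `struct mem_range` and `gpa_to_hva` are hypothetical names for illustration, not vmd's own types or functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one guest memory range mapped into vmd. */
struct mem_range {
	uint64_t	 gpa;	/* guest physical start address */
	size_t		 size;	/* length in bytes */
	uint8_t		*hva;	/* where the range is mapped in this process */
};

/*
 * Return a host pointer for [gpa, gpa + len), or NULL when the span is
 * not fully contained in a single range.
 */
void *
gpa_to_hva(const struct mem_range *r, size_t nranges, uint64_t gpa,
    size_t len)
{
	uint64_t off;
	size_t i;

	for (i = 0; i < nranges; i++) {
		if (gpa < r[i].gpa)
			continue;
		off = gpa - r[i].gpa;
		/* Written to avoid overflow in off + len. */
		if (off < r[i].size && len <= r[i].size - off)
			return r[i].hva + off;
	}
	return NULL;
}
```

With a pointer like this, data can be moved with readv/writev directly instead of being copied through a bounce buffer.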
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
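The pacing arithmetic can be sketched as below. This is not the ns8250 code from the commit: the 8N1 framing (10 bits per character), the `ticks_per_sec` parameter and the function name `chars_per_tick` are assumptions made for illustration.

```c
/* Assumed framing: 1 start + 8 data + 1 stop bit per character (8N1). */
#define BITS_PER_CHAR 10

/*
 * Characters that fit into one pacing interval at the given baud rate.
 * ticks_per_sec is a hypothetical timer frequency (100 means 10 ms ticks).
 * At least one character is always allowed so output cannot stall.
 */
unsigned int
chars_per_tick(unsigned int baud, unsigned int ticks_per_sec)
{
	unsigned int n;

	if (ticks_per_sec == 0)
		return 1;
	n = baud / BITS_PER_CHAR / ticks_per_sec;
	return n > 0 ? n : 1;
}
```

Under these assumptions a 115200 baud console may send 115 characters per 10 ms tick while a 9600 baud console gets 9, which matches the advice to raise the guest's baud rate.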
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.61 |
|
29-Mar-2021 |
dv |
Propagate host-side tap(4) lladdr to guest vm process to allow unicast dhcp and bootp renewals with vmd(8)'s built-in dhcp server. Previous behavior did not intercept these packets and instead transmitted them.
This should make vmd(8)'s dhcp behave more as a true dhcp server should and allows it to work properly with the new dhcpleased(8) attempting a renewal.
OK mlarkin@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
#
1.60 |
|
19-Mar-2021 |
kn |
Remove booting from kernels in raw/qcow2 images
Diff and (slightly tweaked) text below from Dave Voutila < dave at sisu dot io >, thanks!
-- Since 6.7 switched to FFS2 as the default filesystem for new installs, the ability for vmd(8) to load a kernel and boot.conf from a disk image directly (without SeaBIOS) has been broken.
A diff from tb to add FFS2 support never made it into the tree.
On 5th Jan 2021, new ramdisks for amd64 have started shipping gzipped, breaking the ability to load the bsd.rd directly as a kernel image for a vmd guest without first uncompressing the image.
Using BIOS works, the FFS2 change happened ten months ago and few if any have complained about the breakage. vmctl(8) is still vague about supporting it per its man page and one still has to pass the disk image twice as a "-b" and "-d" argument to boot an OpenBSD guest *without* BIOS.
Josh Rickmar reported the gzip issue on bugs@ and provided patches to add support for compressed ramdisks and kernel images. The easiest way to do so is to drop support for FFS images since they require a call to fmemopen(3) while all the other logic uses fopen(3)/fdopen(3) calls and a file descriptor. It is much easier to get those patches merged if they don't have to account for extracting files from disk images. --
No objections anyone "Removing it makes sense" reyk (who wrote the FFS module) OK mlarkin
|
#
1.59 |
|
13-Feb-2021 |
mlarkin |
Fix some wrong comments and KNF/long line wraps
|
Revision tags: OPENBSD_6_8_BASE
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (i.e. the event handler to receive messages from the vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (i.e. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
The previous implementation sometimes hit a deadlock because the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequent.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
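A minimal sketch of the convention described in this commit message, assuming a hypothetical helper `ex_open_errno()` that is not part of vmd. The failure test compares against -1 exactly, and errno is read only on that path.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration (not from vmd): try to open a
 * file read-only. Returns 0 on success, or the errno value set by
 * open(2) on failure.
 */
int
ex_open_errno(const char *path)
{
	int fd;

	/* Compare against -1 exactly, not "< 0". */
	if ((fd = open(path, O_RDONLY)) == -1)
		return (errno);	/* errno is only meaningful on this path */

	close(fd);
	return (0);
}
```

A caller can then treat any non-zero return as an errno value, for example by passing it to strerror(3).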
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b, causing them to run very slowly.
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
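As a rough illustration of tracking VM state in one bitfield, the sketch below uses hypothetical flag and function names (the `EX_`/`ex_` prefixes mark them as invented for this example). It is not vmd's actual definition.

```c
#include <stdint.h>

/* Hypothetical flag names for illustration; vmd's real names may differ. */
#define EX_VM_RUNNING	0x01
#define EX_VM_PAUSED	0x02
#define EX_VM_SHUTDOWN	0x04

struct ex_vm {
	uint32_t state;		/* bitwise OR of EX_VM_* flags */
};

/* Set one or more state flags. */
void
ex_vm_set(struct ex_vm *vm, uint32_t flags)
{
	vm->state |= flags;
}

/* Clear one or more state flags. */
void
ex_vm_clear(struct ex_vm *vm, uint32_t flags)
{
	vm->state &= ~flags;
}

/* Return non-zero if any of the given flags is set. */
int
ex_vm_is(const struct ex_vm *vm, uint32_t flags)
{
	return ((vm->state & flags) != 0);
}
```

Compared with separate variables, a single field lets a caller report or compare the whole state with one integer.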
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is currently used only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
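The lookup order described above (derived image first, then the base image) can be sketched as follows. The `ex_image` structure and `ex_lookup()` are hypothetical simplifications for illustration and do not reflect the real qcow2 on-disk layout or vmd's code.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_NCLUSTERS	4

/* Hypothetical in-memory image: a NULL entry means "not allocated here". */
struct ex_image {
	const uint8_t		*cluster[EX_NCLUSTERS];
	const struct ex_image	*base;	/* NULL for the base image */
};

/*
 * Find the data for cluster idx, starting in the derived image and
 * falling back along the chain of base images. Returns NULL if no
 * image in the chain holds the cluster.
 */
const uint8_t *
ex_lookup(const struct ex_image *img, size_t idx)
{
	if (idx >= EX_NCLUSTERS)
		return (NULL);

	for (; img != NULL; img = img->base) {
		if (img->cluster[idx] != NULL)
			return (img->cluster[idx]);
	}
	return (NULL);
}
```

Because the derived image stores only the clusters it has changed, altering the base image underneath it changes what every fallback lookup returns, which is the limitation the commit message mentions.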
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re-add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
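The idea behind populating an iovec array from guest physical addresses can be sketched as below. This is an illustrative reimplementation under stated assumptions, not vmd's iovec_mem(): the `ex_mem_range` structure, the `ex_iovec_mem()` signature, and the requirement that ranges are sorted by guest physical address are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Hypothetical description of one guest memory range mapped into the host. */
struct ex_mem_range {
	uint64_t	 gpa;	/* guest physical start address */
	size_t		 size;	/* length of the range in bytes */
	uint8_t		*hva;	/* host address where the range is mapped */
};

/*
 * Fill iov[] so that it covers guest physical [gpa, gpa + len).
 * Assumes ranges[] is sorted by gpa. Returns the number of iovecs
 * used, or -1 if the span hits a hole or iov[] is too small.
 */
int
ex_iovec_mem(const struct ex_mem_range *ranges, size_t nranges,
    uint64_t gpa, size_t len, struct iovec *iov, int iovcnt)
{
	size_t i, off, chunk;
	int n = 0;

	for (i = 0; i < nranges && len > 0; i++) {
		const struct ex_mem_range *r = &ranges[i];

		if (gpa < r->gpa)
			return (-1);	/* address falls in a hole */
		if (gpa - r->gpa >= r->size)
			continue;	/* range ends before gpa */
		if (n == iovcnt)
			return (-1);	/* caller's array is full */

		off = gpa - r->gpa;
		chunk = r->size - off;
		if (chunk > len)
			chunk = len;

		iov[n].iov_base = r->hva + off;
		iov[n].iov_len = chunk;
		n++;

		gpa += chunk;
		len -= chunk;
	}
	return (len == 0 ? n : -1);
}
```

The resulting array can be handed to readv(2) or writev(2), so data moves between a disk image and guest memory without an intermediate bounce buffer.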
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
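To make the rate-limiting arithmetic concrete, here is a small sketch. It assumes 10 bits on the wire per character (8N1 framing: one start bit, eight data bits, one stop bit) and a hypothetical function name. The calculation vmd actually uses may differ.

```c
/* Assumed framing: 1 start + 8 data + 1 stop bit per character. */
#define EX_BITS_PER_CHAR	10

/*
 * Hypothetical helper: how many characters may be sent per timer tick
 * so that output does not exceed the programmed baud rate. hz is the
 * number of ticks per second. Always allows at least one character.
 */
unsigned int
ex_chars_per_tick(unsigned int baud, unsigned int hz)
{
	unsigned int n;

	if (hz == 0)
		return (0);

	n = baud / (EX_BITS_PER_CHAR * hz);
	return (n > 0 ? n : 1);
}
```

With 100 ticks per second this gives 9 characters per tick at 9600 baud and 115 at 115200 baud, which is why a higher guest baud rate yields a more responsive console.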
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
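A /31 subnet holds exactly two addresses that differ only in the lowest bit. The sketch below shows that pairing; which side receives the even or odd address is an assumption made here for illustration, and the function names are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helpers operating on IPv4 addresses in host byte order.
 * Given any address inside a /31, return the even (assumed host/tap
 * side) and odd (assumed VM side) members of the pair.
 */
uint32_t
ex_slash31_host(uint32_t addr)
{
	return (addr & ~(uint32_t)1);
}

uint32_t
ex_slash31_guest(uint32_t addr)
{
	return (addr | (uint32_t)1);
}
```

For example, 100.64.0.2 (0x64400002) pairs with 100.64.0.3 (0x64400003).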
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. The use of timeouts and VM ACKs makes sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.58 |
|
28-Jun-2020 |
pd |
vmd(8): Eliminate libevent state corruption
libevent functions for com, pic and rtc are now only called on event_thread. vcpu exit handlers send messages on a dev pipe and callbacks on these events do the event management (event_add, evtimer_add, etc). Previously, libevent state was mutated by two threads: event_thread, which runs all the callbacks, and the vcpu thread when running exit handlers. This could have led to libevent state corruption.
Patch from Dave Voutila <dave@sisu.io>
ok claudio@ tested by abieber@ and brynet@
|
Revision tags: OPENBSD_6_7_BASE
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of the roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (i.e. the event handler to receive messages from the vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (i.e. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
The previous implementation sometimes hit a deadlock because the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
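A minimal sketch of the barrier idea, not the vmd code itself: a condition-variable broadcast that fires before the waiter calls pthread_cond_wait is lost, whereas an arrival at a barrier is counted no matter which thread gets there first. All names (ex_pause_all, ex_vcpu_thread, EX_NVCPUS) are hypothetical and exist only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L	/* for pthread_barrier_t under strict modes */
#include <pthread.h>

#define EX_NVCPUS 4	/* hypothetical vcpu count */

static pthread_barrier_t ex_pause_barrier;
static pthread_mutex_t ex_mtx = PTHREAD_MUTEX_INITIALIZER;
static int ex_paused;

/* Stand-in for a vcpu thread reaching its pause point. */
static void *
ex_vcpu_thread(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&ex_mtx);
	ex_paused++;
	pthread_mutex_unlock(&ex_mtx);
	/* Arrival is recorded even if the pausing thread is not waiting yet. */
	pthread_barrier_wait(&ex_pause_barrier);
	return NULL;
}

/* Returns the number of paused vcpus, or -1 on setup failure. */
int
ex_pause_all(void)
{
	pthread_t tids[EX_NVCPUS];
	int i;

	ex_paused = 0;
	/* One slot per vcpu plus one for the thread requesting the pause. */
	if (pthread_barrier_init(&ex_pause_barrier, NULL, EX_NVCPUS + 1) != 0)
		return -1;
	for (i = 0; i < EX_NVCPUS; i++) {
		if (pthread_create(&tids[i], NULL, ex_vcpu_thread, NULL) != 0)
			return -1;	/* sketch only: a real caller would clean up */
	}
	/* Blocks until every vcpu thread has reached the barrier. */
	pthread_barrier_wait(&ex_pause_barrier);
	for (i = 0; i < EX_NVCPUS; i++)
		pthread_join(tids[i], NULL);
	pthread_barrier_destroy(&ex_pause_barrier);
	return ex_paused;
}
```

The barrier count is the number of vcpus plus one, so the requester cannot proceed until all vcpus have arrived, and no ordering between the two sides can lose a wakeup.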
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
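A small sketch of the convention described above, using open(2) as the example call. The helper name ex_try_open is hypothetical; the point is only the explicit comparison against -1 and reading errno solely in that branch.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Returns 0 if the path could be opened read-only, otherwise the errno
 * value reported by open(2).
 */
int
ex_try_open(const char *path)
{
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1)		/* test for -1 exactly, not "fd < 0" */
		return errno;	/* errno is only meaningful on failure */
	close(fd);
	return 0;
}
```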
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b, causing them to run very slowly
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
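As an illustration of the single-bitfield approach, here is a minimal sketch. The flag names and struct are hypothetical and are not taken from vmd's headers; they only show how several independent states fit in one word.

```c
#include <stdint.h>

/* Hypothetical state flags, one bit each. */
#define EX_STATE_RUNNING	0x01u
#define EX_STATE_PAUSED		0x02u
#define EX_STATE_SHUTDOWN	0x04u

struct ex_vm {
	uint32_t state;		/* all state bits live in one field */
};

static inline void
ex_state_set(struct ex_vm *vm, uint32_t flag)
{
	vm->state |= flag;
}

static inline void
ex_state_clear(struct ex_vm *vm, uint32_t flag)
{
	vm->state &= ~flag;
}

static inline int
ex_state_has(const struct ex_vm *vm, uint32_t flag)
{
	return (vm->state & flag) != 0;
}
```

Reporting code can then copy or print one field instead of checking several variables one by one.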
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect the same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup - vioraw: check malloc - virtio: add function to sync disks - vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
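A minimal sketch of what a port 0x61 read handler could compute, assuming the emulator keeps a latched byte for the port and a flag saying whether PIT channel 2 has fired. The function and macro names are hypothetical; only the use of bit 5 comes from the commit message.

```c
#include <stdint.h>

#define EX_PORT61_PIT2_OUT	0x20u	/* bit 5: PIT channel 2 output status */

/*
 * Build the byte returned for a read of port 0x61: bit 5 is set when
 * channel 2 has fired and cleared while it is still counting.
 */
uint8_t
ex_port61_read(uint8_t latched, int ch2_fired)
{
	if (ch2_fired)
		return (uint8_t)(latched | EX_PORT61_PIT2_OUT);
	return (uint8_t)(latched & ~EX_PORT61_PIT2_OUT);
}
```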
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8) * Support various sized ISOs (Limitation of 4G ISOs on Linux guests) * Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work. * If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
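The translation described above can be sketched roughly as follows. This is not the vmd implementation: the struct layout and the name ex_vaddr_mem are assumptions made for illustration. It shows the single-range assumption explicitly by refusing any request that would cross a range boundary.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one guest memory range mapped into the host. */
struct ex_mem_range {
	uint64_t gpa;	/* guest physical start address */
	size_t	 size;	/* length of the range in bytes */
	uint8_t	*hva;	/* host virtual address backing the range */
};

/*
 * Return a host pointer for [gpa, gpa + len) if it lies entirely inside
 * one range, or NULL if it is unmapped or would span two ranges.
 */
void *
ex_vaddr_mem(const struct ex_mem_range *r, size_t nranges, uint64_t gpa,
    size_t len)
{
	size_t i;
	uint64_t off;

	for (i = 0; i < nranges; i++) {
		if (gpa < r[i].gpa)
			continue;
		off = gpa - r[i].gpa;
		if (off >= r[i].size)
			continue;
		if (len > r[i].size - off)
			return NULL;	/* would cross the end of the range */
		return r[i].hva + off;
	}
	return NULL;
}
```

An iovec-style variant would instead split a request at each range boundary and emit one iovec entry per piece.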
ok mlarkin@
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works), skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
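The rate calculation can be illustrated with a back-of-the-envelope sketch. It assumes 8N1 framing (one start bit, eight data bits, one stop bit, so 10 bits on the wire per character); the function name and that framing choice are assumptions for illustration, not the values used by the ns8250 module.

```c
#include <stdint.h>

#define EX_BITS_PER_CHAR	10u	/* assumption: 8N1 framing */

/*
 * Time in microseconds a real UART would need to transmit nchars
 * characters at the programmed baud rate. An emulator can delay by
 * roughly this long after each burst to avoid outrunning the guest.
 */
uint64_t
ex_tx_usec(uint32_t baud, uint32_t nchars)
{
	if (baud == 0)
		return 0;
	return (uint64_t)nchars * EX_BITS_PER_CHAR * 1000000ULL / baud;
}
```

Under this assumption, 96 characters take 100 ms at 9600 baud, while at 115200 baud the same 100 ms carries 1152 characters, which is why the higher setting gives a much more responsive console.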
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
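The /31 arithmetic can be sketched as below. A /31 contains exactly two addresses that differ only in the lowest bit. Which of the two goes to the host and which to the VM is an assumption here (host lower, VM upper), as is the function name.

```c
#include <stdint.h>

/*
 * Split the /31 containing addr (IPv4, host byte order) into its two
 * addresses. Assumption for this sketch: the host side takes the even
 * address and the VM receives the odd one.
 */
void
ex_slash31_pair(uint32_t addr, uint32_t *host_ip, uint32_t *vm_ip)
{
	uint32_t base = addr & 0xfffffffeu;	/* clear the low bit */

	*host_ip = base;
	*vm_ip = base | 1u;
}
```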
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
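The first bug class can be illustrated with a short sketch: an 8-bit or 16-bit port read must only replace the low bits of %eax and leave the rest untouched. The helper name and its parameters are hypothetical; this is not the code from the commit.

```c
#include <stdint.h>

/*
 * Merge the result of an emulated port read of the given size (1, 2 or
 * 4 bytes) into the guest's %eax, preserving the bits that the IN
 * instruction does not write.
 */
uint32_t
ex_merge_io_read(uint32_t eax, uint32_t data, unsigned int size)
{
	uint32_t mask;

	switch (size) {
	case 1:
		mask = 0xffu;
		break;
	case 2:
		mask = 0xffffu;
		break;
	default:
		mask = 0xffffffffu;
		break;
	}
	return (eax & ~mask) | (data & mask);
}
```

Assigning the device byte directly to %eax instead of merging it is the kind of mistake that clears the high bits.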
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.57 |
|
30-Apr-2020 |
pd |
vmd(8): correctly terminate vm processes after sending vm
Instead of a round about way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
Previously, vm_remove was called in vmm_dispatch_vm (ie. the event handler to receive messages from vm process) when hanlding the IMSG_VMDOP_SEND_VM_RESPONSE (ie. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intented to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
reported by kn@ ok kn@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vpcus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to sets the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end continually de-queueing and re-queueing the event continually. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will makes it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power on defaults, and bump the VM send/recieve header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting a OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This works is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds ot the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating disk derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup - vioraw: check malloc - virtio: add function to sync disks - vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmm(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8) * Support various sized ISOs (Limitation of 4G ISOs on Linux guests) * Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work. * If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guests physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) skylake -> broadwell (don't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where the OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
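The first bug described above, clobbering the upper bits of %eax on a byte-wide port read, comes down to a masking mistake. A minimal sketch of the intended merge, assuming a hypothetical helper name `merge_inb_result` that does not appear in vmd:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from vmd): merge the byte returned by an
 * emulated 8-bit port read into the guest's %eax.
 */
uint32_t
merge_inb_result(uint32_t eax, uint8_t data)
{
	/* Replace only the low 8 bits (%al); keep bits 8..31 as they were. */
	return (eax & 0xffffff00u) | data;
}
```

Assigning `data` directly to the register value would zero bits 8..31, which is the kind of error the commit message describes.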
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.56 |
|
21-Apr-2020 |
pd |
vmd: improve concurrency control in pause
The previous implementation sometimes hit a deadlock because the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
ok mpi@
|
#
1.55 |
|
08-Apr-2020 |
pd |
vmm(4): add IOCTL handler to set the access protections of the ept
This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
ok mpi@
Patch from Adam Steen <adam@adamsteen.com.au>
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequent.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
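The convention described above can be sketched with a small wrapper around write(2). The function name `write_all` is hypothetical and is not part of vmd; only the `== -1` comparison illustrates the point of the commit.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from vmd): write the whole buffer.
 * Returns 0 on success, or -1 on failure with errno set by write(2).
 */
int
write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n == -1) {		/* compare against -1, not "< 0" */
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```

errno is consulted only after the call has returned -1, matching the rule that errno is meaningful only in the error case.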
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b, causing them to run very slowly
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
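A single-bitfield state of the kind described above might look like the sketch below. The `DEMO_STATE_*` names, their values, and the `demo_vm` struct are hypothetical and chosen for illustration; they are not vmd's actual definitions.

```c
#include <stdint.h>

/* Hypothetical flag names and values; each state occupies one bit. */
#define DEMO_STATE_RUNNING	0x01u
#define DEMO_STATE_PAUSED	0x02u
#define DEMO_STATE_SHUTDOWN	0x04u

struct demo_vm {
	uint32_t state;		/* OR of DEMO_STATE_* flags */
};

void
demo_set(struct demo_vm *vm, uint32_t flag)
{
	vm->state |= flag;
}

void
demo_clear(struct demo_vm *vm, uint32_t flag)
{
	vm->state &= ~flag;
}

int
demo_has(const struct demo_vm *vm, uint32_t flag)
{
	return (vm->state & flag) != 0;
}
```

Compared with separate variables, one word can be reported or tested in a single operation, and combined states such as "running and paused" need no extra field.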
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and the new DMA interface. A few direct commands that are needed are implemented, but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This also requires a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is currently used only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used, the internal dhcp server will pass "auto_install" as the boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Boot order support for SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
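The lookup order described above, derived image first and then the base, can be sketched with a toy in-memory model. Everything here (`demo_image`, `demo_lookup`, the fixed cluster count) is hypothetical and much simpler than the real qcow2 on-disk tables.

```c
#include <stddef.h>

#define DEMO_CLUSTERS 4

/* Hypothetical toy image: a fixed table of clusters plus an optional base. */
struct demo_image {
	const char *clusters[DEMO_CLUSTERS];	/* NULL: not allocated here */
	const struct demo_image *base;		/* NULL: standalone image */
};

/*
 * Return the data for cluster idx, searching the derived image first
 * and then each base in turn. NULL means no image in the chain has it.
 */
const char *
demo_lookup(const struct demo_image *img, size_t idx)
{
	if (idx >= DEMO_CLUSTERS)
		return NULL;
	for (; img != NULL; img = img->base) {
		if (img->clusters[idx] != NULL)
			return img->clusters[idx];
	}
	return NULL;
}
```

The sketch also shows why modifying the base is unsafe: any cluster the derived image has not overridden is read straight from the base, so changing it there changes what the derived image sees.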
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
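The readback described above amounts to reflecting the channel 2 state in bit 5 of the byte returned for port 0x61. A minimal sketch, assuming hypothetical names (`demo_port61_read`, `DEMO_PORT61_TMR2_OUT`) that are not taken from vmd:

```c
#include <stdint.h>

#define DEMO_PORT61_TMR2_OUT	0x20u	/* bit 5: PIT channel 2 output */

/*
 * Hypothetical helper (not from vmd): build the value returned for a
 * read of port 0x61 from the other latched bits and the timer state.
 */
uint8_t
demo_port61_read(uint8_t latched, int ch2_fired)
{
	if (ch2_fired)
		return latched | DEMO_PORT61_TMR2_OUT;
	return latched & (uint8_t)~DEMO_PORT61_TMR2_OUT;
}
```

A guest polling this port sees bit 5 clear while the channel is still counting and set once it has fired.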
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re-add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
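The idea behind populating an iovec array from guest physical addresses can be sketched as below. This is not vmd's iovec_mem(): the `demo_range` struct, the function name `demo_iovec_mem`, its signature, and its return convention are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Hypothetical description of one guest memory range mapped into the daemon. */
struct demo_range {
	uint64_t gpa;	/* guest physical start address */
	size_t size;	/* length of the range in bytes */
	uint8_t *hva;	/* where the daemon has it mapped */
};

/*
 * Fill iov[] with pointers covering guest physical [gpa, gpa + len).
 * Returns the number of entries used, or -1 if part of the span is not
 * backed by any range or iov[] is too small.
 */
int
demo_iovec_mem(const struct demo_range *r, size_t nr, uint64_t gpa,
    size_t len, struct iovec *iov, int iovcnt)
{
	int n = 0;

	while (len > 0) {
		size_t i, off, chunk;

		/* Find the range that contains the current address. */
		for (i = 0; i < nr; i++) {
			if (gpa >= r[i].gpa && gpa - r[i].gpa < r[i].size)
				break;
		}
		if (i == nr || n == iovcnt)
			return -1;

		off = (size_t)(gpa - r[i].gpa);
		chunk = r[i].size - off;	/* bytes left in this range */
		if (chunk > len)
			chunk = len;

		iov[n].iov_base = r[i].hva + off;
		iov[n].iov_len = chunk;
		n++;
		gpa += chunk;
		len -= chunk;
	}
	return n;
}
```

The resulting array can be handed to readv(2) or writev(2), so data moves between a disk image and guest memory without an intermediate bounce buffer.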
ok mlarkin@
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works), skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
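The rate calculation described above can be approximated as follows. This is a sketch under assumptions, not the ns8250 code from vmd: the helper name `demo_chars_per_tick`, the 115200 base rate for divisor 1, and the 10-bits-per-character (8N1) framing are all chosen for illustration.

```c
#include <stdint.h>

#define DEMO_UART_BASE_BAUD	115200u	/* assumed rate at divisor 1 */

/*
 * Hypothetical helper: how many characters may be transmitted per
 * timer tick (hz ticks per second) before a delay must be inserted.
 * Always returns at least 1 so output keeps making progress.
 */
uint32_t
demo_chars_per_tick(uint16_t divisor, uint32_t hz)
{
	uint32_t baud, cps, per_tick;

	if (divisor == 0 || hz == 0)
		return 1;

	baud = DEMO_UART_BASE_BAUD / divisor;
	cps = baud / 10;	/* 8N1: 1 start + 8 data + 1 stop bit */
	per_tick = cps / hz;
	return per_tick > 0 ? per_tick : 1;
}
```

Under these assumptions a guest programmed for 9600 baud gets far fewer characters per tick than one programmed for 115200, which is consistent with the advice in the commit message to raise the configured rate.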
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch, the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.54 |
|
11-Dec-2019 |
pd |
vmd: proper concurrency control when pausing a vm
Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
ok mlarkin@
|
#
1.53 |
|
30-Nov-2019 |
mlarkin |
Revert previous - the stability was not as improved as we had thought and we ended up accidentally breaking vmctl. This will need more thought.
ok ori@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end continually de-queueing and re-queueing the event continually. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
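A minimal sketch of the convention this commit enforces, using open(2) as the example. The helper name `open_checked` is hypothetical and not taken from vmd.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a file read-only and report errno on failure.
 * The failure test is "== -1", the documented error return, rather than
 * "< 0"; errno is only meaningful in that case.
 */
int
open_checked(const char *path, int *saved_errno)
{
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1) {
		*saved_errno = errno;
		return -1;
	}
	*saved_errno = 0;
	return fd;
}
```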
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b, causing them to run very slowly
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
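The set-then-verify pattern from this commit might look like the sketch below. The header layout and the `DEMO_SIGNATURE` value are placeholders made up for illustration; they are not the real `vm_dump_header` or `VMM_HV_SIGNATURE`.

```c
#include <stdbool.h>
#include <string.h>

/* Placeholder value, NOT the real VMM_HV_SIGNATURE. */
#define DEMO_SIGNATURE "DEMO-HV-SIG"

/* Hypothetical header layout for illustration only. */
struct demo_dump_header {
	char signature[12];	/* 11 characters plus the terminating NUL */
	unsigned int version;
};

/* Writer side: actually fill in the signature before dumping. */
void
demo_header_init(struct demo_dump_header *h, unsigned int version)
{
	memset(h, 0, sizeof(*h));
	memcpy(h->signature, DEMO_SIGNATURE, sizeof(DEMO_SIGNATURE));
	h->version = version;
}

/* Reader side: refuse images whose signature does not match. */
bool
demo_header_valid(const struct demo_dump_header *h)
{
	return memcmp(h->signature, DEMO_SIGNATURE,
	    sizeof(DEMO_SIGNATURE)) == 0;
}
```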
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
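As a rough illustration of tracking state in one bitfield, consider the sketch below. The flag names and values are hypothetical and were chosen for the example; they are not vmd's actual definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flags; vmd's real names and values may differ. */
#define DEMO_STATE_RUNNING	0x01
#define DEMO_STATE_PAUSED	0x02
#define DEMO_STATE_SHUTDOWN	0x04

/* Set one or more flags in the state word. */
static inline void
demo_state_set(uint32_t *state, uint32_t flags)
{
	*state |= flags;
}

/* Clear one or more flags in the state word. */
static inline void
demo_state_clear(uint32_t *state, uint32_t flags)
{
	*state &= ~flags;
}

/* True only if every requested flag is set. */
static inline bool
demo_state_has(uint32_t state, uint32_t flags)
{
	return (state & flags) == flags;
}
```

A single word like this can be copied into a status reply as-is, which is one reason it is easier to report than several independent variables.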
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
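The lookup order for derived images can be sketched as a walk along a chain of images. The structure below is a deliberately simplified, hypothetical model (a fixed array of cluster pointers); it does not reflect the real qcow2 on-disk format or vmd's implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_NCLUSTERS 4

/* Hypothetical in-memory model of an image and its optional base. */
struct demo_image {
	const uint8_t *cluster[DEMO_NCLUSTERS];	/* NULL: not allocated here */
	const struct demo_image *base;		/* NULL: no base image */
};

/*
 * Start in the derived image; if it holds no data for this cluster,
 * continue with its base, and so on down the chain.
 */
const uint8_t *
demo_lookup(const struct demo_image *img, size_t idx)
{
	if (idx >= DEMO_NCLUSTERS)
		return NULL;
	for (; img != NULL; img = img->base) {
		if (img->cluster[idx] != NULL)
			return img->cluster[idx];
	}
	return NULL;
}
```

The model also shows why modifying the base is unsafe: any cluster the derived image has not overridden is read straight from the base.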
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
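A small sketch of how the channel 2 output state could be reflected in bit 5 of a port 0x61 read. The function and macro names are hypothetical, and the handling of the remaining bits is an assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 5 of port 0x61 reports the PIT channel 2 output state. */
#define DEMO_PORT61_CH2_OUT	(1u << 5)

/*
 * Hypothetical read handler: keep the other bits as given and
 * reflect whether channel 2 has fired in bit 5.
 */
uint8_t
demo_port61_read(uint8_t other_bits, bool ch2_fired)
{
	uint8_t v;

	v = other_bits & (uint8_t)~DEMO_PORT61_CH2_OUT;
	if (ch2_fired)
		v |= DEMO_PORT61_CH2_OUT;
	return v;
}
```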
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re-add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
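To illustrate the idea behind iovec_mem(), the sketch below translates a guest physical span into `struct iovec` entries that point directly at host mappings, so readv(2)/writev(2) can operate on guest memory without a bounce buffer. `struct demo_range` and `demo_iovec_mem()` are hypothetical; the real function's signature and range bookkeeping are not shown in this log.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Hypothetical description of one guest memory range. */
struct demo_range {
	uint64_t gpa;	/* guest physical start address */
	size_t size;	/* length of the range in bytes */
	uint8_t *hva;	/* where the range is mapped in this process */
};

/*
 * Fill iov[] so that it covers [gpa, gpa + len) in guest memory.
 * Returns the number of entries used, or -1 if part of the span is
 * unmapped or more than iovcnt entries would be needed.
 */
int
demo_iovec_mem(const struct demo_range *r, size_t nr, uint64_t gpa,
    size_t len, struct iovec *iov, int iovcnt)
{
	int n = 0;

	while (len > 0) {
		size_t i, off, chunk;

		/* Find the range containing the current address. */
		for (i = 0; i < nr; i++) {
			if (gpa >= r[i].gpa && gpa - r[i].gpa < r[i].size)
				break;
		}
		if (i == nr || n == iovcnt)
			return -1;

		off = (size_t)(gpa - r[i].gpa);
		chunk = r[i].size - off;
		if (chunk > len)
			chunk = len;

		iov[n].iov_base = r[i].hva + off;
		iov[n].iov_len = chunk;
		n++;
		gpa += chunk;
		len -= chunk;
	}
	return n;
}
```

The resulting array can be handed to readv(2) or writev(2) together with a disk image file descriptor.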
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works), skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
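The pacing arithmetic behind simulated baud rates can be sketched as follows. The names are hypothetical, the 115200 base rate is the usual PC-style 8250 value for divisor 1, and 10 bits per character assumes 8N1 framing; how vmd actually schedules its delay is not shown here.

```c
#include <stdint.h>

#define DEMO_UART_BASE_BAUD	115200u	/* rate at divisor 1 (assumed) */
#define DEMO_BITS_PER_CHAR	10u	/* assumes 8N1: start + 8 data + stop */

/* Baud rate implied by the programmed divisor latch; 0 if unset. */
uint32_t
demo_baud(uint16_t divisor)
{
	return divisor == 0 ? 0 : DEMO_UART_BASE_BAUD / divisor;
}

/*
 * Microseconds a real UART would need to transmit nchars characters.
 * An emulator could delay by this amount after each burst of nchars.
 */
uint64_t
demo_tx_usec(uint32_t baud, uint32_t nchars)
{
	if (baud == 0)
		return 0;
	return (uint64_t)nchars * DEMO_BITS_PER_CHAR * 1000000u / baud;
}
```

Under these assumptions a 9600 baud console moves roughly 960 characters per second, while 115200 baud moves about twelve times that, which matches the advice above to raise the guest's configured rate.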
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. Usage of timeout and VM ACKs make sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch; the implementation is independent of the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.52 |
|
29-Nov-2019 |
mlarkin |
Fix at least one cause of VMs spinning at 100% host CPU
After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end continually de-queueing and re-queueing the event continually. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
with help from and ok ori@
|
Revision tags: OPENBSD_6_6_BASE
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will makes it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power on defaults, and bump the VM send/recieve header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting a OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This works is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds ot the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating disk derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup - vioraw: check malloc - virtio: add function to sync disks - vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmm(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8) * Support various sized ISOs (Limitation of 4G ISOs on Linux guests) * Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work. * If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd-usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
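As a rough illustration of the idea behind iovec_mem(), the sketch below splits a guest-physical range into (host offset, length) pieces. It is written in Python for brevity rather than vmd's C, and the MemRange type, its field names, and the example layout are assumptions made for this example, not vmd's actual structures.

```python
from typing import List, NamedTuple, Tuple


class MemRange(NamedTuple):
    """Hypothetical guest memory range: guest-physical base, size, host offset."""
    gpa: int
    size: int
    host_off: int


def iovec_for(ranges: List[MemRange], gpa: int, length: int) -> List[Tuple[int, int]]:
    """Return (host_offset, length) pieces covering [gpa, gpa + length)."""
    pieces = []
    while length > 0:
        # Find the range that backs the current guest-physical address.
        for r in ranges:
            if r.gpa <= gpa < r.gpa + r.size:
                break
        else:
            raise ValueError(f"guest address {gpa:#x} is not backed by memory")
        off = gpa - r.gpa
        n = min(length, r.size - off)  # stop at the end of this range
        pieces.append((r.host_off + off, n))
        gpa += n
        length -= n
    return pieces


# Two ranges with a hole between them (layout invented for the example).
RANGES = [MemRange(0x0, 0xA0000, 0x0), MemRange(0x100000, 0x100000, 0xA0000)]
```

Each returned piece could back one iovec entry, so a single readv or writev call can move data without an intermediate copy. A request that runs into the hole raises an error instead of returning a partial result.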
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (does not work).
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
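The pacing arithmetic described in this entry can be sketched as follows. This is an illustrative calculation in Python, not the ns8250 code: the 10 bits per character (8N1 framing) and the 100 Hz tick are assumptions chosen for the example.

```python
def chars_per_tick(baud: int, hz: int = 100, bits_per_char: int = 10) -> int:
    """Characters that fit in one 1/hz-second tick at the given baud rate.

    bits_per_char=10 assumes 8N1 framing (start bit + 8 data bits + stop bit).
    At least one character per tick is allowed so output always progresses.
    """
    return max(1, baud // (bits_per_char * hz))


def ticks_to_send(nchars: int, baud: int, hz: int = 100) -> int:
    """Number of tick-long pauses needed to emit nchars at roughly `baud`."""
    per_tick = chars_per_tick(baud, hz)
    return -(-nchars // per_tick)  # ceiling division
```

Under these assumptions, 9600 baud allows about 9 characters per tick and 115200 baud about 115, which matches the advice to raise the guest's baud rate for a more responsive console.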
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
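For readers unfamiliar with /31 networks, the following Python sketch shows how such a prefix yields exactly two usable addresses, one for each side of the link. The example prefix and the choice of which address goes to the host and which to the VM are assumptions of this sketch, not taken from vmd.

```python
import ipaddress


def local_pair(prefix: str):
    """Split a /31 into (host side, VM side) addresses.

    Which of the two addresses is assigned to which side is an assumption
    made for this illustration.
    """
    net = ipaddress.ip_network(prefix)
    if net.prefixlen != 31:
        raise ValueError("expected a /31 network")
    return net[0], net[1]


# Example prefix invented for illustration.
host_ip, vm_ip = local_pair("100.64.0.2/31")
```

A /31 has no separate network or broadcast address, so both addresses are usable on a point-to-point link like the one between the host's tap(4) interface and the VM.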
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. The use of a timeout and VM ACKs makes sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch; the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.51 |
|
17-Jul-2019 |
pd |
vmm/vmd: Fix migration with pvclock
Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
reads ok mlarkin@
|
#
1.50 |
|
28-Jun-2019 |
deraadt |
When system calls indicate an error they return -1, not some arbitrary value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b, causing them to run very slowly
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add an x86 page table walker
Add a first cut of an x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
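To make the notion of a page table walk concrete, here is a minimal Python sketch of a 4-level x86-64 walk. It is not the vmd(8)/vmm(4) code: it handles 4 KiB pages only, ignores large pages and permission bits, and the read_u64 callback and the toy memory contents are assumptions for this example.

```python
PRESENT = 1 << 0                 # bit 0 of every paging entry: present
ADDR_MASK = 0x000FFFFFFFFFF000   # bits 51:12 hold the next table's address


def walk(read_u64, cr3: int, va: int) -> int:
    """Translate virtual address va to a physical address.

    read_u64(pa) is a hypothetical callback returning the 64-bit entry
    stored at guest-physical address pa.
    """
    table = cr3 & ADDR_MASK
    # Each level indexes the table with 9 bits of the virtual address:
    # PML4 (47:39), PDPT (38:30), PD (29:21), PT (20:12).
    for shift in (39, 30, 21, 12):
        index = (va >> shift) & 0x1FF
        entry = read_u64(table + index * 8)
        if not entry & PRESENT:
            raise LookupError(f"page not present for {va:#x}")
        table = entry & ADDR_MASK
    return table | (va & 0xFFF)


# Toy page tables mapping virtual 0x403000 to physical 0x9000.
MEM = {
    0x1000 + 0 * 8: 0x2000 | PRESENT,
    0x2000 + 0 * 8: 0x3000 | PRESENT,
    0x3000 + 2 * 8: 0x4000 | PRESENT,
    0x4000 + 3 * 8: 0x9000 | PRESENT,
}
```

A walker like this is what lets an emulator turn a guest virtual address, such as the buffer of a string I/O instruction, into guest-physical memory it can read or write.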
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
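The bitfield approach can be illustrated with a short Python sketch. The flag names and values below are hypothetical and chosen only for this example; they are not vmd's actual state constants.

```python
from enum import IntFlag


class VmState(IntFlag):
    """Hypothetical state bits; the names are illustrative, not vmd's."""
    RUNNING = 0x01
    PAUSED = 0x02
    SHUTDOWN = 0x04
    RECEIVED = 0x08


state = VmState.RUNNING
state |= VmState.PAUSED                   # set a bit
is_paused = bool(state & VmState.PAUSED)  # test a bit
state &= ~VmState.PAUSED                  # clear it again
```

Keeping every state in one word means several conditions can be checked or reported with a single mask instead of consulting separate variables.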
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface. A few direct commands that are needed are implemented, but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This also requires a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is currently used only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
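The lookup order described in this entry can be sketched with a toy model in Python. This is not the qcow2 on-disk format or vmd's implementation: the Image class, the dictionary of clusters, and the 4-byte cluster size are all assumptions made to keep the example small.

```python
from typing import Dict, Optional


class Image:
    """Toy disk image: a sparse map of cluster number -> data, plus a base."""

    def __init__(self, clusters: Dict[int, bytes], base: Optional["Image"] = None):
        self.clusters = clusters
        self.base = base

    def read_cluster(self, n: int, size: int = 4) -> bytes:
        """Look in this image first, then walk down the chain of base images."""
        img = self
        while img is not None:
            if n in img.clusters:
                return img.clusters[n]
            img = img.base
        return bytes(size)  # unallocated everywhere: reads as zeroes

    def write_cluster(self, n: int, data: bytes) -> None:
        """Writes only ever touch the derived image, never the base."""
        self.clusters[n] = data


base = Image({0: b"boot", 1: b"root"})
derived = Image({}, base=base)
derived.write_cluster(1, b"new!")
```

Because the derived image stores only the clusters that differ from its base, any later change to the base silently alters what the derived image reads, which is the corruption risk the entry warns about.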
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.49 |
|
28-May-2019 |
pd |
vmd: unset CR0_CD and CR0_NW in default flat64 register values
These never got unset on AMD/SVM guests when booted via vmctl start -b causing them to run very slow
ok mlarkin@
|
#
1.48 |
|
12-May-2019 |
pd |
vmm: add a x86 page table walker
Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
With help from Mike Larkin
ok mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will makes it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect the same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and the new DMA interface. A few direct commands that are needed are implemented, but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This also requires a -current kernel to pass the IO ports to vmd(8).
OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is currently used only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used, the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Boot order support for SeaBIOS is not yet implemented.
Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
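The lookup order described above can be modeled with a toy structure. This Python sketch is not the qcow2 on-disk format or vmd's code; the `Image` class, its cluster map, and the choice to return empty bytes for unallocated clusters are all assumptions made for illustration.

```python
from typing import Dict, Optional


class Image:
    """Toy image: a sparse map of cluster index -> data, plus an optional base."""

    def __init__(self, clusters: Dict[int, bytes],
                 base: Optional["Image"] = None) -> None:
        self.clusters = clusters
        self.base = base

    def read_cluster(self, index: int) -> bytes:
        """Look in the derived image first, then fall back along the base chain."""
        image: Optional[Image] = self
        while image is not None:
            if index in image.clusters:
                return image.clusters[index]
            image = image.base
        return b""  # unallocated everywhere (toy stand-in for zeroes)

    def write_cluster(self, index: int, data: bytes) -> None:
        """Writes land in this image only; the base is never modified."""
        self.clusters[index] = data
```

Because a derived image stores only the clusters it has overwritten, any later change to the base silently changes what the derived image reads, which is the limitation noted in the commit message.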
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
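A small Python sketch of the readback, assuming only what the message states (bit 5 of port 0x61 reflects the PIT channel 2 status). The function name and the handling of the remaining bits are hypothetical.

```python
PIT_CH2_OUT_BIT = 1 << 5   # bit 5 of port 0x61, per the commit message


def port61_read(ch2_fired: bool, other_bits: int = 0) -> int:
    """Build the byte returned for a read of port 0x61.

    other_bits carries whatever the remaining bits should hold; only
    bit 5 is derived from the channel 2 status here.
    """
    value = other_bits & 0xFF & ~PIT_CH2_OUT_BIT
    if ch2_fired:
        value |= PIT_CH2_OUT_BIT
    return value
```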
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re-add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd-usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
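The two ideas can be modeled in Python with `memoryview`, which references a buffer without copying it. This is only an analogy for the C functions named above: `direct_view`, `scatter_views`, and the list-of-ranges layout are hypothetical, and the real signatures are not shown in this log.

```python
from typing import List, Tuple

# Toy guest memory: (guest physical base, host-side buffer) pairs.
MemRange = Tuple[int, bytearray]


def find_range(ranges: List[MemRange], gpa: int) -> MemRange:
    """Return the memory range that contains guest physical address gpa."""
    for base, buf in ranges:
        if base <= gpa < base + len(buf):
            return base, buf
    raise ValueError("address is not backed by guest memory")


def direct_view(ranges: List[MemRange], gpa: int, length: int) -> memoryview:
    """Zero-copy view of one span; it must sit inside a single range."""
    base, buf = find_range(ranges, gpa)
    off = gpa - base
    if off + length > len(buf):
        raise ValueError("span crosses a memory range boundary")
    return memoryview(buf)[off:off + length]


def scatter_views(ranges: List[MemRange], gpa: int,
                  length: int) -> List[memoryview]:
    """Split a span into per-range views, in the spirit of an iovec array."""
    views = []
    while length > 0:
        base, buf = find_range(ranges, gpa)
        off = gpa - base
        chunk = min(length, len(buf) - off)
        views.append(memoryview(buf)[off:off + chunk])
        gpa += chunk
        length -= chunk
    return views
```

Writing through a returned view changes the underlying buffer directly, which is the "no bounce buffer" property the commit message describes; a list of such views is the kind of argument vectored I/O calls accept.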
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This prevents receiving vms from hosts that have more cpu features than the receiving host.
Tested on broadwell -> skylake (works) and skylake -> broadwell (doesn't work)
ok mlarkin@
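One way to express an "upward only" rule is a subset test on feature bitmasks: the receiver must have every feature the sender had. The Python sketch below assumes that representation; the function name and the example masks are made up and do not correspond to real CPUID layouts or to vmd's actual check.

```python
def can_receive(sender_features: int, receiver_features: int) -> bool:
    """Allow the receive only if the sender has no feature the receiver lacks."""
    return (sender_features & ~receiver_features) == 0


# Made-up masks: the "newer" host has one extra feature bit.
OLDER_HOST = 0b0111
NEWER_HOST = 0b1111
```

Here `can_receive(OLDER_HOST, NEWER_HOST)` holds while the reverse direction does not, matching the tested broadwell/skylake outcome in spirit.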
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
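The pacing arithmetic can be sketched as below. The numbers are assumptions for illustration: 10 bits on the wire per character (8N1 framing) and a tick rate of 100 per second; the commit does not state the constants vmd uses, and the function names are hypothetical.

```python
BITS_PER_CHAR = 10   # assumption: 1 start + 8 data + 1 stop bit
HZ = 100             # assumption: timer ticks per second


def chars_per_tick(baud: int, hz: int = HZ) -> int:
    """Characters that may be sent before pausing for one tick."""
    return max(1, baud // (BITS_PER_CHAR * hz))


def pauses_needed(nchars: int, baud: int, hz: int = HZ) -> int:
    """Number of one-tick pauses inserted while sending nchars characters."""
    return nchars // chars_per_tick(baud, hz)
```

Under these assumptions a 512-character burst needs 56 pauses at 9600 baud but only 4 at 115200, which is why the higher setting gives a noticeably faster console.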
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
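A /31 subnet holds exactly two addresses, one for each end of the link. The Python sketch below splits such a subnet into a host-side and a VM-side address; which end gets which address, the function name, and the example subnet are assumptions for illustration.

```python
import ipaddress
from typing import Tuple


def local_pair(subnet: str) -> Tuple[ipaddress.IPv4Address,
                                     ipaddress.IPv4Address]:
    """Split a /31 into (host-side address, VM-side address)."""
    net = ipaddress.IPv4Network(subnet)
    if net.prefixlen != 31:
        raise ValueError("expected a /31 subnet")
    host_addr, vm_addr = list(net)   # a /31 contains exactly two addresses
    return host_addr, vm_addr
```

In this model the host-side address would be configured on the tap(4) interface and the other handed to the guest by the built-in BOOTP server.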
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
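The first bug concerns partial register writes: a one-byte port read should replace only the low byte of %eax. A Python sketch of the intended merge follows; the function name is hypothetical and the code is not taken from vmd.

```python
def merge_io_read(old_eax: int, data: int, size: int) -> int:
    """Return %eax after a port read of `size` bytes (1, 2 or 4).

    Only the low `size` bytes are replaced; the upper bits of the old
    value are preserved instead of being zeroed.
    """
    mask = (1 << (8 * size)) - 1
    return (old_eax & ~mask & 0xFFFFFFFF) | (data & mask)
```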
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. The use of timeouts and VM ACKs makes sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch; the implementation is independent of the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.47 |
|
11-May-2019 |
jasper |
vm_dump_header allocated space for a signature but it was never set; set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
ok mlarkin@ pd@
|
#
1.46 |
|
11-May-2019 |
jasper |
track the state of the vm (running, paused, etc) using a single bitfield instead of a handful of separate variables. this will makes it easier for vmd to report and check on the individual vm states
no functional change intended
ok ccardenas@ mlarkin@
|
Revision tags: OPENBSD_6_5_BASE
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power on defaults, and bump the VM send/recieve header to reflect same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting a OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This works is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds ot the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating disk derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup - vioraw: check malloc - virtio: add function to sync disks - vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmm(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8) * Support various sized ISOs (Limitation of 4G ISOs on Linux guests) * Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work. * If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guests physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) skylake -> broadwell (don't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where the OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. The use of a timeout and VM ACKs makes sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch; the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.45 |
|
01-Mar-2019 |
mlarkin |
vmd(8): remove some i386 remnants that missed the original cleanup
ok pd, kn, deraadt
|
#
1.44 |
|
20-Feb-2019 |
mlarkin |
vmd(8): initialize guest %drX registers to power-on defaults on launch
Initializes the %drX registers to power-on defaults, and bumps the VM send/receive header to reflect the same
discussed with deraadt@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is currently used only when booting an OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
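The lookup order described in this entry can be sketched as follows. This is a toy in-memory model, not vmd's qcow2 code: `struct image`, `read_cluster()`, and the fixed cluster geometry are hypothetical names chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NCLUSTERS	4
#define CLUSTER_SIZE	8

/* Hypothetical image layer: each cluster is either present or a hole. */
struct image {
	const struct image *base;	/* backing (base) image, or NULL */
	int	 present[NCLUSTERS];
	uint8_t	 data[NCLUSTERS][CLUSTER_SIZE];
};

/*
 * Read one cluster, starting in the derived image and falling back to
 * its base. A cluster found in no layer reads back as zeroes.
 */
static void
read_cluster(const struct image *img, size_t idx, uint8_t *out)
{
	assert(idx < NCLUSTERS);
	for (; img != NULL; img = img->base) {
		if (img->present[idx]) {
			memcpy(out, img->data[idx], CLUSTER_SIZE);
			return;
		}
	}
	memset(out, 0, CLUSTER_SIZE);
}
```

Because the derived layer only stores the clusters it has written, changing the base image underneath it changes what the derived image reads back, which is the corruption risk the entry mentions.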
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re-add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
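A minimal sketch of the address translation idea behind this entry, under the stated assumption that a contiguous guest physical span sits in a single memory range. `struct mem_range` and `gpa_to_host()` are hypothetical names for illustration and are not vmd's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one guest memory range mapped into vmd. */
struct mem_range {
	uint64_t gpa;	/* guest physical start address */
	size_t	 size;	/* length in bytes */
	uint8_t	*hva;	/* where the range lives in this process */
};

/*
 * Return a directly usable pointer for [gpa, gpa + len), or NULL if the
 * span does not fit entirely inside one range.
 */
static void *
gpa_to_host(const struct mem_range *r, size_t nranges, uint64_t gpa,
    size_t len)
{
	size_t i;

	assert(r != NULL || nranges == 0);
	for (i = 0; i < nranges; i++) {
		if (gpa < r[i].gpa)
			continue;
		uint64_t off = gpa - r[i].gpa;
		/* Written to avoid overflow in off + len. */
		if (off < r[i].size && len <= r[i].size - off)
			return (r[i].hva + off);
	}
	return (NULL);
}
```

Returning NULL for a span that crosses a range boundary keeps the single-range assumption explicit instead of silently handing back a pointer that runs off the end of a mapping.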
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on: broadwell -> skylake (works), skylake -> broadwell (doesn't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
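The arithmetic behind pacing output to a baud rate can be sketched like this. It is an illustration only and not the ns8250 code from this commit: `tx_time_usec()` is a hypothetical helper, and the 10 bits per character figure assumes 8N1 framing (one start bit, eight data bits, one stop bit).

```c
#include <assert.h>
#include <stdint.h>

#define BITS_PER_CHAR	10	/* assumption: 8N1 framing */

/*
 * Microseconds a real UART would need to shift out nchars characters
 * at the given baud rate. An emulator could delay by this long after
 * emitting a batch of nchars characters.
 */
static uint64_t
tx_time_usec(uint32_t baud, uint32_t nchars)
{
	assert(baud > 0);
	return ((uint64_t)nchars * BITS_PER_CHAR * 1000000ULL / baud);
}
```

Under these assumptions 96 characters take 100000 microseconds at 9600 baud, while 115200 baud moves 1152 characters in the same time, which is why the entry suggests raising the guest's console speed.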
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
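As a rough illustration of the /31 addressing mentioned in this entry: a /31 contains exactly two addresses that differ only in the lowest bit. The sketch below assumes, purely for illustration, that the host side takes the even address and the VM the odd one; `struct p2p_pair` and `slash31_pair()` are hypothetical names, and addresses are in host byte order.

```c
#include <assert.h>
#include <stdint.h>

/* The two addresses of a /31 point-to-point subnet. */
struct p2p_pair {
	uint32_t host;	/* assumed: address on the tap(4) side */
	uint32_t guest;	/* assumed: address handed to the VM */
};

/* Derive both ends of the /31 that contains addr. */
static struct p2p_pair
slash31_pair(uint32_t addr)
{
	struct p2p_pair p;

	p.host = addr & ~UINT32_C(1);
	p.guest = p.host | UINT32_C(1);
	assert((p.host ^ p.guest) == 1);
	return (p);
}
```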
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. The use of a timeout and VM ACKs makes sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch; the implementation is independent from the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.43 |
|
10-Dec-2018 |
claudio |
Implement the fw_cfg interface basics and use it to set the bootorder if a bootdevice was forced. This implements both the pure IO port interface and also the new DMA interface, a few direct commands are implemented which are needed but in general the "file" interface should be used. There is no write support for the guest. Tested against the latest vmm-firmware port. This requires also a -current kernel to pass the IO ports to vmd(8). OK mlarkin@ ccardenas@
|
#
1.42 |
|
06-Dec-2018 |
claudio |
Make it possible to define the bootdevice in vmd. This information is used currently only when booting a OpenBSD kernel. If VMBOOTDEV_NET is used the internal dhcp server will pass "auto_install" as boot file to the client and the boot loader passes the MAC of the first interface to the kernel to indicate PXE booting. Adding boot order support to SeaBIOS is not yet implemented. Ok ccardenas@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This works is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds ot the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating disk derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup - vioraw: check malloc - virtio: add function to sync disks - vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmm(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8) * Support various sized ISOs (Limitation of 4G ISOs on Linux guests) * Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work. * If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd usable pointer based on a guests physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) skylake -> broadwell (don't work)
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to a LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where the OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. The use of timeouts and VM ACKs makes sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch; the implementation is independent of the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
Revision tags: OPENBSD_6_4_BASE
|
#
1.41 |
|
08-Oct-2018 |
reyk |
Add support for qcow2 base images (external snapshots).
This work is from Ori Bernstein, committing on his behalf:
Add support to vmd for external snapshots. That is, snapshots that are derived from a base image. Data lookups start in the derived image, and if the derived image does not contain some data, the search proceeds to the base image. Multiple derived images may exist off of a single base image.
A limitation of this format is that modifying the base image will corrupt the derived image.
This change also adds support for creating derived disk images to vmctl. To use it:
vmctl create derived.qcow2 -s 16G -b base.qcow2
From Ori Bernstein OK mlarkin@ reyk@
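The lookup order described above can be illustrated with a small model. This is only a sketch, not vmd's qcow2 code: the `Image` class, the dictionary of clusters, and the tiny `CLUSTER_SIZE` are all hypothetical and exist only to show reads falling through from the derived image to its base.

```python
# Hypothetical model: an image is a dict of cluster index -> bytes,
# plus an optional base (backing) image. Not vmd's actual data layout.
CLUSTER_SIZE = 4  # tiny cluster size so the example is easy to follow


class Image:
    def __init__(self, clusters, base=None):
        self.clusters = dict(clusters)  # only clusters allocated in this image
        self.base = base                # base image, or None

    def read_cluster(self, index):
        """Return the cluster from the nearest image in the chain that has it."""
        image = self
        while image is not None:
            if index in image.clusters:
                return image.clusters[index]
            image = image.base
        return bytes(CLUSTER_SIZE)  # unallocated everywhere: reads as zeros

    def write_cluster(self, index, data):
        """Writes land in the derived image; the base is never touched."""
        self.clusters[index] = data


base = Image({0: b"AAAA", 1: b"BBBB"})
derived = Image({}, base=base)
derived.write_cluster(1, b"bbbb")
```

In this model, cluster 0 of `derived` is still served from `base`, which is why changing the base image afterwards would change what every derived image sees.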
|
#
1.40 |
|
28-Sep-2018 |
reyk |
Support vmd-internal's vmboot with qcow2 disk images.
OK mlarkin@
|
#
1.39 |
|
19-Sep-2018 |
ccardenas |
Various clean up items for disks.
- qcow2: general cleanup
- vioraw: check malloc
- virtio: add function to sync disks
- vm: call virtio_shutdown to sync disks when vm is finished executing
Thanks to Ori Bernstein.
Ok miko@
|
#
1.38 |
|
17-Jul-2018 |
mlarkin |
vmd(8): fix vmctl -b option for i386 kernels.
ok pd@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re-add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd-usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
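The address translation behind this can be sketched as follows. This is an illustrative model only, not the code from the commit: `gpa_to_segments`, the `MemRange` tuple layout, and the example addresses are hypothetical. It splits a guest-physical span into (address, length) pieces, one per backing memory range, which is the shape of data an iovec array carries.

```python
from typing import List, Tuple

# (guest physical start, size, address of the range inside vmd); hypothetical layout
MemRange = Tuple[int, int, int]


def gpa_to_segments(ranges: List[MemRange], gpa: int, length: int) -> List[Tuple[int, int]]:
    """Split [gpa, gpa + length) into (vmd address, length) segments."""
    segments = []
    while length > 0:
        for start, size, host_base in ranges:
            if start <= gpa < start + size:
                # Take as much as this range can supply, then continue.
                chunk = min(length, start + size - gpa)
                segments.append((host_base + (gpa - start), chunk))
                gpa += chunk
                length -= chunk
                break
        else:
            raise ValueError(f"guest address {gpa:#x} is not backed by memory")
    return segments


ranges = [(0x0, 0x1000, 0x70000000), (0x1000, 0x1000, 0x90000000)]
crossing = gpa_to_segments(ranges, 0xFF0, 0x20)
single = gpa_to_segments(ranges, 0x100, 0x40)
```

A span that yields exactly one segment, like `single` here, corresponds to the case vaddr_mem() assumes: the whole region sits in one memory range and can be referenced through a single pointer.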
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (does not work).
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
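As a rough illustration of the arithmetic involved (not the ns8250 code itself), the number of characters that fit into one clock tick at a given baud rate can be estimated as below. The function name, the 10 bits per character (8N1 framing with start and stop bits), and a tick rate of 100 per second are assumptions made for this sketch.

```python
def chars_per_tick(baud: int, hz: int = 100, bits_per_char: int = 10) -> int:
    """Approximate characters that may be sent per clock tick.

    Assumes 10 bits on the wire per character and hz ticks per second;
    both values are assumptions of this sketch, not taken from vmd.
    """
    return max(1, baud // bits_per_char // hz)


slow = chars_per_tick(9600)
fast = chars_per_tick(115200)
```

Under these assumptions a 115200 baud console may emit roughly an order of magnitude more characters between delays than a 9600 baud one, which matches the advice above to raise the guest's baud rate.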
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
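A /31 network contains exactly two addresses, so one goes to the host's tap(4) interface and the other to the VM. The sketch below shows that split with Python's `ipaddress` module. It is illustrative only: the function name, the example prefix, the choice of giving the host the lower address, and pointing the gateway and DNS entries at the host are assumptions, not details confirmed by the commit.

```python
import ipaddress


def local_interface_lease(network: str) -> dict:
    """Split a /31 into a host address and the values offered to the VM."""
    net = ipaddress.IPv4Network(network)
    if net.prefixlen != 31:
        raise ValueError("expected a /31 network")
    host_addr, vm_addr = net[0], net[1]  # assumption: host takes the lower address
    return {
        "host": str(host_addr),
        "vm": str(vm_addr),
        "gateway": str(host_addr),  # assumption: the host side is the gateway
        "dns": str(host_addr),      # assumption: the host side is the resolver
    }


lease = local_interface_lease("100.64.0.2/31")
```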
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option, which replaces the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. The use of timeouts and VM ACKs makes sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch; the implementation is independent of the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.37 |
|
12-Jul-2018 |
mlarkin |
vmd(8)/vmm(4): send a copy of the guest register state to vmd on exit, avoiding multiple readregs ioctls back to vmm in case register content is needed subsequently.
ok phessler
|
#
1.36 |
|
10-Jul-2018 |
mlarkin |
vmd(8): route ELCR handler to the right function
|
#
1.35 |
|
09-Jul-2018 |
mlarkin |
vmd(8): better debug message in a failure case
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re-add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd-usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
|
#
1.24 |
|
20-Aug-2017 |
pd |
vmd: Allow only upward migration
This restricts receiving vms from hosts with more cpu features.
Tested on broadwell -> skylake (works) and skylake -> broadwell (does not work).
ok mlarkin@
|
#
1.23 |
|
14-Aug-2017 |
mlarkin |
vmd: set MSR_MISC_ENABLE=0 on vm creation, this will be re-set in vmm based on proper values from the host in use.
|
#
1.22 |
|
15-Jul-2017 |
pd |
Add vmctl send and vmctl receive
ok reyk@ and mlarkin@
|
#
1.21 |
|
09-Jul-2017 |
pd |
vmd/vmctl: Add ability to pause / unpause vms
With help from Ashwin Agrawal
ok reyk@ mlarkin@
|
#
1.20 |
|
07-Jun-2017 |
mlarkin |
vmd: Implement simulated baudrate support in the ns8250 module. The previous version was allowing an output rate that is "too fast", and linux guests would give up after 512 characters TXed ("too much work for irq4").
This diff calculates the approximate rate we can sustain at the current programmed baud rate and limits the output to that rate by inserting a HZ delay after a specified number of characters have been transmitted. This fixes the linux guest console issue.
Note that the console now outputs at more or less the selected baud rate, instead of nearly instantaneously as before - if you selected 9600 in your guest VMs before, you might want to change that to 115200 now for a better console experience.
krw@ "seems like a good idea to me"
|
#
1.19 |
|
30-May-2017 |
tedu |
split vioblk read/write functions into start and finish as prep for async io operations. ok mlarkin
|
#
1.18 |
|
28-May-2017 |
mlarkin |
SVM: add some exit types
Also, fix a comment that wasn't applicable anymore, and change a format from decimal to hex
|
#
1.17 |
|
05-May-2017 |
reyk |
VMs cannot use proc_compose() to PROC_VMM, they have to use imsg_compose() on the "vmm_pipe" directly. This fixes the communication channel from VMs back to vmm.
|
#
1.16 |
|
05-May-2017 |
mlarkin |
Allow vmd(8) to set guest %xcr0
Usermode part of previous vmm(4) diff.
Posted to tech by Pratik Vyas
|
#
1.15 |
|
02-May-2017 |
mlarkin |
fix an error in i386 vmd build
|
#
1.14 |
|
02-May-2017 |
mlarkin |
Matching vmd(8) part of previous diff (first part of vmctl send/receive).
ok kettenis
|
#
1.13 |
|
25-Apr-2017 |
reyk |
spacing
|
#
1.12 |
|
19-Apr-2017 |
reyk |
Add support for dynamic "NAT" interfaces (-L/local interface).
When a local interface is configured, vmd configures a /31 address on the tap(4) interface of the host and provides another IP in the same subnet via DHCP (BOOTP) to the VM. vmd runs an internal BOOTP server that replies with IP, gateway, and DNS addresses to the VM. The built-in server only ever responds to the VM on the inside and cannot leak its DHCP responses to the outside.
Thanks to Uwe Werler, Josh Grosse, and some others for testing!
OK deraadt@
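A /31 holds exactly two addresses, so the host and VM sides differ only in the lowest bit. The sketch below assumes the host takes the even address and the VM the odd one; that assignment and the helper name `slash31_pair()` are illustrative assumptions, not taken from vmd.

```c
#include <stdint.h>

/*
 * Split the /31 containing addr (IPv4, host byte order) into its two
 * addresses. Which side gets which is an assumption of this sketch.
 */
void
slash31_pair(uint32_t addr, uint32_t *host, uint32_t *guest)
{
	uint32_t net = addr & 0xfffffffeU;	/* /31 mask clears the low bit */

	*host = net;
	*guest = net | 1U;
}
```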
|
Revision tags: OPENBSD_6_1_BASE
|
#
1.11 |
|
27-Mar-2017 |
deraadt |
die whitespace die die die
|
#
1.10 |
|
25-Mar-2017 |
mlarkin |
Last bits needed to get seabios + alpine linux working. This is enough to get started and let more people help finding and fixing bugs.
ok kettenis, deraadt
|
#
1.9 |
|
25-Mar-2017 |
reyk |
Boot using BIOS from /etc/firmware/vmm-bios by default.
Instead of using the internal "vmboot", VMs will now be booted using the external BIOS firmware in /etc/firmware/vmm-bios (which is subject to an LGPLv3 license). Direct booting of OpenBSD kernels or non-default BIOS images is still supported for now using the -b/boot option that is replacing the -k/kernel option.
As requested by Theo, vmd(8) fails if neither the default BIOS is found nor a kernel has been specified in the VM configuration. The "vmm" BIOS has to be installed using fw_update(1), which will be done automatically in most cases where OpenBSD can fetch it after install/upgrade.
OK mlarkin@
|
#
1.8 |
|
25-Mar-2017 |
mlarkin |
Implement some missing functionality and clean up some code in vmd pci emulation.
ok kettenis
|
#
1.7 |
|
25-Mar-2017 |
mlarkin |
Introduce a new function to obtain properly sized input data, and convert i8253/i8259/mc146818 emulation to use this.
|
#
1.6 |
|
24-Mar-2017 |
mlarkin |
Allow vmd to proceed after an interrupt occurred after retiring a cpuid instruction. Matches previous commit to kernel vmm.c
|
#
1.5 |
|
23-Mar-2017 |
mlarkin |
Implement memory size and SMP CPU count NVRAM registers in the emulated mc146818. This is needed for seabios to boot properly (and construct a sensible e820 map to send to the guest OS).
|
#
1.4 |
|
21-Mar-2017 |
mlarkin |
Fix two errors in NS8250 (UART) emulation. The first error zeroed out the high bits of %eax on reading register data from the emulated UART ports. The second error didn't properly assert the TXRDY bit during init - this bit was only set after the first character was sent. Both these bugs caused seabios to not be able to output any data. Found during the recent effort to get Linux guests booting.
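The first fix amounts to merging the byte read from the port into the low 8 bits of %eax instead of overwriting the whole register. A sketch, with the hypothetical helper name `merge_inb()`:

```c
#include <stdint.h>

/*
 * Emulate the register effect of an 8-bit port read: only the low
 * byte of eax changes, the upper 24 bits are preserved.
 */
uint32_t
merge_inb(uint32_t eax, uint8_t data)
{
	return (eax & 0xffffff00U) | data;
}
```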
|
#
1.3 |
|
15-Mar-2017 |
reyk |
Improve vmmci(4) shutdown and reboot.
This change handles various cases to power off the VM, even if it is unresponsive, stuck in ddb, or when the shutdown was initiated from the VM guest side. The use of timeouts and VM ACKs makes sure that the VM is really turned off at some point.
OK mlarkin@
|
#
1.2 |
|
02-Mar-2017 |
reyk |
Add "locked lladdr" option to prevent VMs from spoofing MAC addresses.
This is especially useful when multiple VMs share a switch; the implementation is independent of the underlying switch or bridge.
no objections mlarkin@
|
#
1.1 |
|
01-Mar-2017 |
reyk |
Split vmm.c into two files: vm.c for the VM child, vmm.c for the parent
As discussed with mlarkin@, it makes it easier to maintain the file.
OK mlarkin@
|
#
1.34 |
|
19-Jun-2018 |
reyk |
knf
|
#
1.33 |
|
27-Apr-2018 |
mlarkin |
vmd(8): implement vmd side of ELCR registers
ok guenther
|
#
1.32 |
|
26-Apr-2018 |
mlarkin |
vmd(8): handle PIT channel 2 status readback via port 0x61
Allow PIT channel 2 status (fired/counting) readback via port 0x61 bit 5.
ok guenther@
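A sketch of the readback, assuming a hypothetical helper `port61_read()` that is given the other port 0x61 bits and the channel 2 state; only the use of bit 5 comes from the commit message.

```c
#include <stdint.h>

#define PORT61_PIT2_OUT	(1U << 5)	/* bit 5: PIT channel 2 status */

/*
 * Build the value returned for a read of port 0x61: bit 5 reflects
 * whether channel 2 has fired, the other bits come from port_state.
 */
uint8_t
port61_read(uint8_t port_state, int ch2_fired)
{
	if (ch2_fired)
		return port_state | PORT61_PIT2_OUT;
	return port_state & (uint8_t)~PORT61_PIT2_OUT;
}
```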
|
Revision tags: OPENBSD_6_3_BASE
|
#
1.31 |
|
03-Jan-2018 |
ccardenas |
Add initial CD-ROM support to VMD via vioscsi.
* Adds 'cdrom' keyword to vm.conf(5) and '-r' to vmctl(8)
* Support various sized ISOs (Limitation of 4G ISOs on Linux guests)
* Known working guests: OpenBSD (primary), Alpine Linux (primary), CentOS 6 (secondary), Ubuntu 17.10 (secondary). NOTE: Secondary indicates some issue(s) preventing full/reliable functionality outside the scope of the vioscsi work.
* If the attached disks are non-bootable (i.e. empty), SeaBIOS (vmd's default BIOS) will boot from CD-ROM.
ok mlarkin@, jca@
|
#
1.30 |
|
29-Nov-2017 |
mlarkin |
make vmm(4) less responsible for initial register state, preferring to let usermode daemons handle that.
ok pd@
|
#
1.29 |
|
28-Nov-2017 |
mlarkin |
fix some spelling errors in a few comments
|
Revision tags: OPENBSD_6_2_BASE
|
#
1.28 |
|
19-Sep-2017 |
mlarkin |
Clarify a wrong conditional, found by jsg.
ok jsg
|
#
1.27 |
|
17-Sep-2017 |
pd |
vmd: send/recv pci config space instead of recreating pci devices on receive
ok mlarkin@
|
#
1.26 |
|
17-Sep-2017 |
pd |
vmd: re-add rtc.per and rtc.sec evtimers on receive
This was missed in receive. mc146818_start is already defined. This fixes rtc time resync on receive.
ok mlarkin@
|
#
1.25 |
|
11-Sep-2017 |
dlg |
add functions to provide direct access to guest memory as vmd addresses
iovec_mem() populates an iovec array based on guest physical addresses. this allows the use of things like readv and writev for moving data between the guest and a disk image file without having to bounce the memory.
vaddr_mem() provides a vmd-usable pointer based on a guest's physical address. this makes it possible to directly reference things like virtio rings without having to bounce that memory either. however, it assumes that a contiguous range of guest physical memory will sit in a single vm memory range. mlarkin@ says this is right.
ok mlarkin@
|