History log of /freebsd-current/usr.sbin/bhyve/virtio.c
Revision Date Author Comments
# 0efad4ac 16-Feb-2024 Jessica Clarke <jrtc27@jrtc27.com>

bhyve: Support legacy PCI interrupts on arm64

This allows us to remove various #ifdef hacks and enable building more
PCI devices.

Note that a hole is left in the interrupt mapping for the RTC rather
than having the two core devices straddle the PCIe interrupts. QEMU's
virt machine also takes this approach.

Reviewed by: jhb
MFC after: 2 weeks
Obtained from: CheriBSD


# 4bb929cc 03-Apr-2024 Mark Johnston <markj@FreeBSD.org>

bhyve: Partially disable INT#x support in virtio for arm64

A FreeBSD guest won't make use of this support and pci_lintr_* is not
implemented on arm64. Simply make pci_lintr_*() calls amd64-specific
for now.

Reviewed by: corvink, jhb
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D41741


# 4d65a7c6 24-Nov-2023 Warner Losh <imp@FreeBSD.org>

usr.sbin: Automated cleanup of cdefs and other formatting

Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.

Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/

Sponsored by: Netflix


# 1d386b48 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# 0f735657 24-Mar-2023 John Baldwin <jhb@FreeBSD.org>

bhyve: Remove vmctx member from struct vm_snapshot_meta.

This is a userland-only pointer that isn't relevant to the kernel and
doesn't belong in the ioctl structure shared between userland and the
kernel. For the kernel, the old structure for the ioctl is still
supported under COMPAT_FREEBSD13.

This changes vm_snapshot_req() in libvmmapi to accept an explicit
vmctx argument.

It also changes vm_snapshot_guest2host_addr to take an explicit vmctx
argument. As part of this change, move the declaration for this
function and its wrapper macro from vmm_snapshot.h to snapshot.h as it
is a userland-only API.

Reviewed by: corvink, markj
Differential Revision: https://reviews.freebsd.org/D38125


# 6a284cac 19-Jan-2023 John Baldwin <jhb@FreeBSD.org>

bhyve: Remove vmctx argument from PCI device model methods.

Most of these arguments were unused. Device models which do need
access to the vmctx in one of these methods can obtain it from the
pi_vmctx member of the pci_devinst argument instead.

Reviewed by: corvink, markj
Differential Revision: https://reviews.freebsd.org/D38096


# 78c2cd83 09-Dec-2022 John Baldwin <jhb@FreeBSD.org>

bhyve: Remove unused vcpu argument from PCI read/write methods.

Reviewed by: corvink, markj
Differential Revision: https://reviews.freebsd.org/D37652


# 593200c2 11-Nov-2022 Mark Johnston <markj@FreeBSD.org>

bhyve: Drop volatile qualifiers from virtio rings

The qualifiers are there presumably because these rings are mapped into
the guest, but they do not appear to be required for correctness, and
bhyve generally doesn't qualify accesses to guest memory this way.
Moreover, the qualifiers are discarded by snapshot code, causing clang
to emit warnings. Just stop using volatile here.

MFC after: 2 weeks
Reviewed by: corvink, jhb
Differential Revision: https://reviews.freebsd.org/D37291


# ed721684 23-Oct-2022 Mark Johnston <markj@FreeBSD.org>

bhyve: Address some signed/unsigned comparison warnings

MFC after: 1 week


# 98d920d9 08-Oct-2022 Mark Johnston <markj@FreeBSD.org>

bhyve: Annotate unused function parameters

MFC after: 1 week


# 37045dfa 16-Aug-2022 Mark Johnston <markj@FreeBSD.org>

bhyve: Mark variables and functions as static where appropriate

Mark them const as well when it makes sense to do so. No functional
change intended.

MFC after: 1 week
Sponsored by: The FreeBSD Foundation


# d079fc19 16-Dec-2021 Vitaliy Gusev <gusev.vitaliy@gmail.com>

bhyve: Only snapshot initialized VirtIO queues

If the virtio device is not fully initialized, then suspend fails with:

vi_pci_snapshot_queues: invalid address: vq->vq_desc
Failed to snapshot virtio-rnd; ret=14

MFC after: 1 week
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D26268


# 054accac 08-Jun-2021 Corvin Köhne <C.Koehne@beckhoff.com>

Add a virtio-input device emulation.

This will be used to inject keyboard/mouse input events into a guest.
The command line syntax is:
-s <slot>,virtio-input,/dev/input/eventX

Reviewed by: jhb (bhyve), grehan
Obtained from: Corvin Köhne <C.Koehne@beckhoff.com>
MFC after: 3 weeks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D30020


# b0139127 30-Mar-2021 Ka Ho Ng <khng@FreeBSD.org>

bhyve: change vq_getchain to return iovecs in both directions

The old prototype requires callers to inspect flags of each descriptors
to get the starting position of host-writable iovecs.

vq_getchain() is changed to return a virtio request with the number of
host-readable iovecs and host-writable iovecs instead. Callers can avoid
boilerplate code of getting the start offset of host-writable iovecs.

Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Reviewed by: afedorov
Approved by: philip (mentor)
Differential Revision: https://reviews.freebsd.org/D29433


# 54ac6f72 16-Mar-2021 Ka Ho Ng <khng300@gmail.com>

bhyve: virtio shares definitions between sys/dev/virtio

Definitions inside usr.sbin/bhyve/virtio.h are thrown away.
Definitions in sys/dev/virtio are used instead.

This reduces code duplication.

Sponsored by: The FreeBSD Foundation
Reviewed by: grehan
Approved by: philip (mentor)
Differential Revision: https://reviews.freebsd.org/D29084


# 483d953a 04-May-2020 John Baldwin <jhb@FreeBSD.org>

Initial support for bhyve save and restore.

Save and restore (also known as suspend and resume) permits a snapshot
to be taken of a guest's state that can later be resumed. In the
current implementation, bhyve(8) creates a UNIX domain socket that is
used by bhyvectl(8) to send a request to save a snapshot (and
optionally exit after the snapshot has been taken). A snapshot
currently consists of two files: the first holds a copy of guest RAM,
and the second file holds other guest state such as vCPU register
values and device model state.

To resume a guest, bhyve(8) must be started with a matching pair of
command line arguments to instantiate the same set of device models as
well as a pointer to the saved snapshot.

While the current implementation is useful for several uses cases, it
has a few limitations. The file format for saving the guest state is
tied to the ABI of internal bhyve structures and is not
self-describing (in that it does not communicate the set of device
models present in the system). In addition, the state saved for some
device models closely matches the internal data structures which might
prove a challenge for compatibility of snapshot files across a range
of bhyve versions. The file format also does not currently support
versioning of individual chunks of state. As a result, the current
file format is not a fixed binary format and future revisions to save
and restore will break binary compatiblity of snapshot files. The
goal is to move to a more flexible format that adds versioning,
etc. and at that point to commit to providing a reasonable level of
compatibility. As a result, the current implementation is not enabled
by default. It can be enabled via the WITH_BHYVE_SNAPSHOT=yes option
for userland builds, and the kernel option BHYVE_SHAPSHOT.

Submitted by: Mihai Tiganus, Flavius Anton, Darius Mihai
Submitted by: Elena Mihailescu, Mihai Carabas, Sergiu Weisz
Relnotes: yes
Sponsored by: University Politehnica of Bucharest
Sponsored by: Matthew Grooms (student scholarships)
Sponsored by: iXsystems
Differential Revision: https://reviews.freebsd.org/D19495


# 332eff95 08-Jan-2020 Vincenzo Maffione <vmaffione@FreeBSD.org>

bhyve: add wrapper for debug printf statements

Add printf() wrapper to use CR/CRLF terminators depending on whether
stdio is mapped to a tty open in raw mode.
Try to use the wrapper everywhere.
For now we leave the custom DPRINTF/WPRINTF defined by device
models, but we may remove them in the future.

Reviewed by: grehan, jhb
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D22657


# d55e0373 08-Nov-2019 Vincenzo Maffione <vmaffione@FreeBSD.org>

bhyve: add support for virtio-net mergeable rx buffers

Mergeable rx buffers is a virtio-net feature that allows the hypervisor
to use multiple RX descriptor chains to receive a single receive packet.
Without this feature, a TSO-enabled guest is compelled to publish only
64K (or 32K) long chains, and each of these large buffers is consumed
to receive a single packet, even a very short one. This is a waste of
memory, as a RX queue has room for 256 chains, which means up to 16MB
of buffer memory for each (single-queue) vtnet device.
With the feature on, the guest can publish 2K long chains, and the
hypervisor will merge them as needed.

This change also enables the feature in the netmap backend, which
supports virtio-net offloads. We plan to add support for the
tap backend too.
Note that differently from QEMU/KVM, here we implement one-copy receive,
while QEMU uses two copies.

Reviewed by: jhb
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D21007


# 17e9052c 11-Jun-2019 Vincenzo Maffione <vmaffione@FreeBSD.org>

bhyve: virtio: introduce vq_kick_enable() and vq_kick_disable()

The VirtIO standard supports two schemes for notification suppression:
a notification enable bit and a more sophisticated one (event_idx) that
also supports delayed notifications. Currently bhyve fully supports
only the first scheme. This patch hides the notification suppression
internals by means of two inline routines, vq_kick_enable() and
vq_kick_disable(), and makes the code more readable.
Moreover, further improve readability by replacing the call to mb()
with a call to atomic_thread_fence_seq_cst(), which is already used
in virtio.c

Reviewed by: pmooney_pfmooney.com, bryanv
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20581


# efc25f65 18-May-2019 Rodney W. Grimes <rgrimes@FreeBSD.org>

bhyve virtio needs barriers

Under certain tight race conditions, we found that the lack of a memory
barrier in bhyve's virtio handling causes it to miss a NO_NOTIFY state
transition on block devices, resulting in guest stall. The investigation
is recorded in OS-7613. As part of the examination into bhyve's use of
barriers, one other section was found to be problematic, but only on
non-x86 ISAs with less strict memory ordering. That was addressed in
this patch as well, although it was not at all a problem on x86.

PR: 231117
Submitted by: Patrick Mooney <patrick.mooney@joyent.com>
Reviewed by: jhb, kib, rgrimes
Approved by: jhb
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D19501


# f7224b70 13-Jun-2018 Marcelo Araujo <araujo@FreeBSD.org>

Fix style(9) space vs tab.

Reviewed by: jhb
MFC after: 3 weeks.
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D15768


# 1de7b4b8 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

various: general adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.


# be80efd4 11-May-2015 Peter Grehan <grehan@FreeBSD.org>

Handling indirect descriptors is a capability of the host and
not one that needs to be negotiated. Use the host capabilities
field and not the negotiated field when verifying that indirect
descriptors are supported.

Found with the Redhat Windows viostor driver, which clears
the indirect capability in the negotiated caps and then starts
using them.

Reported and tested by: Leon Dang (ldang@nahannisys.com)
MFC after: 2 weeks


# fed2d5ed 26-Mar-2015 Peter Grehan <grehan@FreeBSD.org>

Move legacy interrupt allocation for virtio devices to common code.
There are a number of assumptions about legacy interrupts always
being available in virtio so don't allow back-ends to make the
decision to support them.

This fixes the issue seen with virtio-rnd on OpenBSD. MSI-x vectors
were not being used, and the virtio-rnd backend wasn't allocating a
legacy interrupt resulting in a bhyve assert and guest exit.

Reported by: Julian Hsiao, madoka at nyanisore dot net
Reviewed by: neel
MFC after: 1 week


# 7315946b 15-Mar-2015 Alexander Motin <mav@FreeBSD.org>

Fix networking problem after r280026.

I've missed that network driver sometimes returns taken request back to
available queue without processing. Add new helper function for that case.

Reported by: flo
MFC after: 2 weeks


# fdb7e97f 15-Mar-2015 Alexander Motin <mav@FreeBSD.org>

Modify virtqueue helpers added in r253440 to allow queuing.

Original virtqueue design allows queued and out-of-order processing, but
helpers added in r253440 suppose only direct blocking in-order one.
It could be fine for network, etc., but it is a huge limitation for storage
devices.


# e18f344b 08-Sep-2014 Peter Grehan <grehan@FreeBSD.org>

Add a callback to be notified about negotiated features.

Submitted by: luigi
Obtained from: Vincenzo Maffione, Universita` di Pisa
MFC after: 3 days


# b297e71e 22-Aug-2014 Tycho Nightingale <tychon@FreeBSD.org>

Fix a recursive lock acquisition in vi_reset_dev().

Reviewed by: grehan


# f23a8ac1 02-Jul-2014 Peter Grehan <grehan@FreeBSD.org>

Use correct flag for event index.

Submitted by: luigi
Obtained from: Vincenzo Maffione, Universita` di Pisa
MFC after: 1 week


# 3cbf3585 29-Jan-2014 John Baldwin <jhb@FreeBSD.org>

Enhance the support for PCI legacy INTx interrupts and enable them in
the virtio backends.
- Add a new ioctl to export the count of pins on the I/O APIC from vmm
to the hypervisor.
- Use pins on the I/O APIC >= 16 for PCI interrupts leaving 0-15 for
ISA interrupts.
- Populate the MP Table with I/O interrupt entries for any PCI INTx
interrupts.
- Create a _PRT table under the PCI root bridge in ACPI to route any
PCI INTx interrupts appropriately.
- Track which INTx interrupts are in use per-slot so that functions
that share a slot attempt to distribute their INTx interrupts across
the four available pins.
- Implicitly mask INTx interrupts if either MSI or MSI-X is enabled
and when the INTx DIS bit is set in a function's PCI command register.
Either assert or deassert the associated I/O APIC pin when the
state of one of those conditions changes.
- Add INTx support to the virtio backends.
- Always advertise the MSI capability in the virtio backends.

Submitted by: neel (7)
Reviewed by: neel
MFC after: 2 weeks


# d68f0bd6 09-Jan-2014 Peter Grehan <grehan@FreeBSD.org>

Fix issue with the virtio descriptor region being truncated
if it was above 4GB. This was seen with CentOS 6.5 guests with
large RAM, since the block drivers are loaded late in the
boot sequence and end up allocating descriptor memory from
high addresses.

Reported by: Michael Dexter
MFC after: 3 days


# aaa30169 17-Sep-2013 Peter Grehan <grehan@FreeBSD.org>

Pass the number of supported vectors to pci_emul_add_msicap() and
not the actual PCI BAR number.

Reviewed by: neel
Approved by: re@ (blanket)


# ba41c3c1 17-Jul-2013 Peter Grehan <grehan@FreeBSD.org>

Major rework of the virtio code. Split out common parts, and modify
the net/block devices accordingly.

Submitted by: Chris Torek torek at torek dot net
Reviewed by: grehan