History log of /linux-master/include/uapi/linux/vhost_types.h
Revision Date Author Comments
# 4398776f 21-Oct-2023 Si-Wei Liu <si-wei.liu@oracle.com>

vhost-vdpa: introduce IOTLB_PERSIST backend feature bit

Userspace needs this feature flag to tell whether the vhost-vdpa iotlb
in the kernel can be trusted to persist IOTLB mappings across vDPA
reset. Without it, userspace cannot tell whether it is running on an
older kernel, which could silently drop all iotlb mappings across vDPA
reset, especially with a broken parent driver implementation of the
.reset driver op. The broken driver may incorrectly drop all of its own
mappings as part of .reset, which inadvertently leaves the mapping
state inconsistent between vhost-vdpa userspace and the kernel. As a
workaround, to make the mapping behaviour predictable across reset,
userspace has to proactively remove all mappings before vDPA reset and
restore them afterwards. Today this workaround is applied
unconditionally on top of all parent drivers, because of the parent
driver implementation issue and the lack of any means to differentiate.
QEMU has used this workaround since the day its vhost-vdpa userspace
backend was introduced.

There are 3 cases in which the backend may claim this feature bit:

- parent device that has to work with platform IOMMU
- parent device with on-chip IOMMU that has the expected
  .reset_map support in driver
- parent device with vendor specific IOMMU implementation that
  already has persistent IOTLB mapping and has to specifically
  declare this backend feature

The reason .reset_map is one of the preconditions for persistent iotlb
is that without it, vhost-vdpa can't switch the iotlb back to the
initial state later on, especially for the on-chip IOMMU case, which
starts with identity mapping at device creation. virtio-vdpa requires
the on-chip IOMMU to perform 1:1 passthrough translation from PA to
IOVA as-is to begin with, and .reset_map is the only means to turn the
iotlb back to the identity mapping mode after vhost-vdpa is gone.

The difference in behavior did not matter so far, as QEMU unmaps all
the memory when unregistering the memory listener at
vhost_vdpa_dev_start(started = false). The backend acknowledging this
feature flag, however, allows QEMU to be sure that it is safe to skip
this unmap and map in a vhost stop and start cycle.

In that sense, this feature flag is actually a signal for userspace to
know that the driver bug has been solved. Not offering it indicates that
userspace cannot trust the kernel will retain the maps.
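
As a minimal sketch (not code from this patch), userspace could turn
the negotiated backend features into a decision about the workaround
described above. The helper names are hypothetical, and the bit
position is copied by hand and should be checked against the installed
vhost_types.h.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: bit position copied by hand from
 * include/uapi/linux/vhost_types.h; prefer the installed header.
 */
#ifndef VHOST_BACKEND_F_IOTLB_PERSIST
#define VHOST_BACKEND_F_IOTLB_PERSIST 0x8
#endif

/* Hypothetical helper: is feature bit `bit` set in `features`? */
static bool backend_has_feature(uint64_t features, unsigned int bit)
{
    return bit < 64 && (features & (UINT64_C(1) << bit)) != 0;
}

/*
 * Hypothetical policy helper: without IOTLB_PERSIST, userspace keeps the
 * old workaround (unmap everything before reset, map it again afterwards).
 */
static bool must_remap_across_reset(uint64_t acked_features)
{
    return !backend_has_feature(acked_features, VHOST_BACKEND_F_IOTLB_PERSIST);
}
```

Here acked_features stands for the value userspace acknowledged through
the backend features ioctls; only when the bit is present may the
unmap and map around a stop and start cycle be skipped.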

Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <1697880319-4937-4-git-send-email-si-wei.liu@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>


# 7db0d602 18-Oct-2023 Si-Wei Liu <si-wei.liu@oracle.com>

vhost-vdpa: introduce descriptor group backend feature

Userspace knows whether the device has a dedicated descriptor group
by checking this feature bit.

It's only exposed if the vdpa driver backend implements the
.get_vq_desc_group() operation callback. Userspace trying to negotiate
this feature when it or the dependent _F_IOTLB_ASID feature hasn't
been exposed will result in an error.
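
The dependency rule above can be sketched as a small check. This is an
illustration under stated assumptions, not the kernel's code: the
function name and the -1 return value are hypothetical, and the bit
positions (including the VHOST_BACKEND_F_DESC_ASID name for this
feature) are copied by hand and should be verified against the header.

```c
#include <stdint.h>

/* Assumption: bit positions copied by hand from vhost_types.h. */
#ifndef VHOST_BACKEND_F_IOTLB_ASID
#define VHOST_BACKEND_F_IOTLB_ASID 0x3
#endif
#ifndef VHOST_BACKEND_F_DESC_ASID
#define VHOST_BACKEND_F_DESC_ASID 0x7
#endif

#define FEATURE_MASK(bit) (UINT64_C(1) << (bit))

/*
 * Hypothetical check: returns 0 if the request is acceptable, -1 if the
 * descriptor group feature is requested while it, or the IOTLB_ASID
 * feature it depends on, was not exposed by the backend.
 */
static int check_desc_group_request(uint64_t offered, uint64_t requested)
{
    const uint64_t desc = FEATURE_MASK(VHOST_BACKEND_F_DESC_ASID);
    const uint64_t asid = FEATURE_MASK(VHOST_BACKEND_F_IOTLB_ASID);

    if (!(requested & desc))
        return 0;   /* feature not requested, nothing to check */
    if (!(offered & desc) || !(offered & asid))
        return -1;  /* feature or its dependency was not exposed */
    return 0;
}
```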

Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231018171456.1624030-5-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Si-Wei Liu <si-wei.liu@oracle.com>
Tested-by: Lei Yang <leiyang@redhat.com>


# 8b59b4da 09-Jun-2023 Eugenio Pérez <eperezma@redhat.com>

vdpa: add VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag

This feature flag allows the driver to enable virtqueues both before
and after DRIVER_OK.

This is needed for software assisted live migration, so userland can
restore the device status in devices with a control virtqueue before
the dataplane is enabled.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Shannon Nelson <shannon.nelson@amd.com>
Message-Id: <20230609092127.170673-2-eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>


# c1ecd8e9 26-Jun-2023 Mike Christie <michael.christie@oracle.com>

vhost: allow userspace to create workers

For vhost-scsi with 3 vqs or more and a workload that tries to use
them in parallel like:

fio --filename=/dev/sdb --direct=1 --rw=randrw --bs=4k \
--ioengine=libaio --iodepth=128 --numjobs=3

the single vhost worker thread will become a bottleneck and we are stuck
at around 500K IOPs no matter how many jobs, virtqueues, and CPUs are
used.

To better utilize virtqueues and available CPUs, this patch allows
userspace to create workers and bind them to vqs. You can have N workers
per dev and also share N workers with M vqs on that dev.
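
As an illustration of sharing N workers among M vqs, the sketch below
computes a round-robin assignment. The round-robin policy, the struct
and the function name are assumptions made for this example; the patch
itself only provides the interface for creating workers and binding
them to vqs, and the worker ids would come from the kernel.

```c
#include <stddef.h>

/* Hypothetical stand-in for a (vring index, worker id) binding. */
struct vring_worker_pair {
    unsigned int index;      /* virtqueue index */
    unsigned int worker_id;  /* id of the worker serving that virtqueue */
};

/*
 * Spread nvqs virtqueues over nworkers worker ids in round-robin order.
 * Returns the number of bindings written to out, or 0 on bad input.
 */
static size_t assign_workers(const unsigned int *worker_ids, size_t nworkers,
                             struct vring_worker_pair *out, size_t nvqs)
{
    if (worker_ids == NULL || out == NULL || nworkers == 0)
        return 0;

    for (size_t vq = 0; vq < nvqs; vq++) {
        out[vq].index = (unsigned int)vq;
        out[vq].worker_id = worker_ids[vq % nworkers];
    }
    return nvqs;
}
```

With two workers and three vqs, the first and third vq end up sharing
the first worker; each resulting pair would then be handed to the
kernel in a separate bind request.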

This patch adds the interface related code and the next patch will hook
vhost-scsi into it. The patches do not try to hook net and vsock into
the interface because:

1. multiple workers don't seem to help vsock. The problem is that with
only 2 virtqueues we never fully use the existing worker when doing
bidirectional tests. This seems to match vhost-scsi where we don't see
the worker as a bottleneck until 3 virtqueues are used.

2. net already has a way to use multiple workers.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Message-Id: <20230626232307.97930-16-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>


# 69106b6f 03-Jan-2023 Sebastien Boeuf <sebastien.boeuf@intel.com>

vhost-vdpa: Introduce RESUME backend feature bit

Userspace knows if the device can be resumed or not by checking this
feature bit.

It's only exposed if the vdpa driver backend implements the resume()
operation callback. Userspace trying to negotiate this feature when it
hasn't been exposed will result in an error.

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Message-Id: <b18db236ba3d990cdb41278eb4703be9201d9514.1672742878.git.sebastien.boeuf@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>


# 0723f1df 10-Aug-2022 Eugenio Pérez <eperezma@redhat.com>

vhost-vdpa: introduce SUSPEND backend feature bit

Userland knows if it can suspend the device or not by checking this feature
bit.

It's only offered if the vdpa driver backend implements the suspend()
operation callback. It is an error to offer it, or for userland to ack
it, if the backend does not implement that callback.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20220810171512.2343333-3-eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>


# 94dfc73e 06-Apr-2022 Gustavo A. R. Silva <gustavoars@kernel.org>

treewide: uapi: Replace zero-length arrays with flexible-array members

There is a regular need in the kernel to provide a way to declare
having a dynamically sized set of trailing elements in a structure.
Kernel code should always use “flexible array members”[1] for these
cases. The older style of one-element or zero-length arrays should
no longer be used[2].

This code was transformed with the help of Coccinelle:
(linux-5.19-rc2$ spatch --jobs $(getconf _NPROCESSORS_ONLN) --sp-file script.cocci --include-headers --dir . > output.patch)

@@
identifier S, member, array;
type T1, T2;
@@

struct S {
...
T1 member;
T2 array[
- 0
];
};

-fstrict-flex-arrays=3 is coming and we need to land these changes
to prevent issues like these in the near future:

../fs/minix/dir.c:337:3: warning: 'strcpy' will always overflow; destination buffer has size 0,
but the source string has length 2 (including NUL byte) [-Wfortify-source]
strcpy(de3->name, ".");
^

Since these are all [0] to [] changes, the risk to UAPI is nearly zero. If
this breaks anything, we can use a union with a new member name.
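
For readers unfamiliar with the construct, here is a minimal userspace
sketch of a flexible array member. The struct and function are
hypothetical and do not come from any header touched by this patch.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure, not taken from any kernel header. */
struct blob {
    uint32_t len;
    uint8_t data[];  /* flexible array member; older code wrote data[0] */
};

/* Allocate a blob with room for len trailing bytes and copy src into it. */
static struct blob *blob_new(const void *src, uint32_t len)
{
    struct blob *b = malloc(sizeof(*b) + len);

    if (b == NULL)
        return NULL;
    b->len = len;
    if (len != 0)
        memcpy(b->data, src, len);
    return b;
}
```

Because the array is declared without a size, the compiler knows that
its length is determined at allocation time instead of treating it as
a zero-sized buffer.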

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.16/process/deprecated.html#zero-length-and-one-element-arrays

Link: https://github.com/KSPP/linux/issues/78
Build-tested-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/62b675ec.wKX6AOZ6cbE71vtF%25lkp@intel.com/
Acked-by: Dan Williams <dan.j.williams@intel.com> # For ndctl.h
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>


# 91233ad7 30-Mar-2022 Gautam Dawar <gautam.dawar@xilinx.com>

vhost: support ASID in IOTLB API

This patch allows userspace to send ASID based IOTLB messages to
vhost. The idea is to use the reserved u32 field in the existing V2
IOTLB message. The vhost device should advertise this capability via
the VHOST_BACKEND_F_IOTLB_ASID backend feature.
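
A sketch of what an ASID based V2 message looks like from userspace
follows. The structs are local stand-ins that mirror the layout of the
UAPI vhost_iotlb_msg and vhost_msg_v2 as understood here, and the
SKETCH_ constants are hand-copied values; all of these are assumptions
to be checked against the real headers.

```c
#include <stdint.h>
#include <string.h>

/* Assumed values, copied by hand from the vhost UAPI headers. */
#define SKETCH_IOTLB_MSG_V2 0x2  /* mirrors VHOST_IOTLB_MSG_V2 */
#define SKETCH_IOTLB_UPDATE 2    /* mirrors VHOST_IOTLB_UPDATE */
#define SKETCH_ACCESS_RW    0x3  /* mirrors VHOST_ACCESS_RW */

/* Local stand-in for struct vhost_iotlb_msg. */
struct sketch_iotlb_msg {
    uint64_t iova;
    uint64_t size;
    uint64_t uaddr;
    uint8_t perm;
    uint8_t type;
};

/* Local stand-in for struct vhost_msg_v2. */
struct sketch_msg_v2 {
    uint32_t type;
    uint32_t asid;  /* the formerly reserved u32 now carries the ASID */
    union {
        struct sketch_iotlb_msg iotlb;
        uint8_t padding[64];
    };
};

/* Build an IOTLB update for one mapping in address space `asid`. */
static struct sketch_msg_v2 make_iotlb_update(uint32_t asid, uint64_t iova,
                                              uint64_t size, uint64_t uaddr,
                                              uint8_t perm)
{
    struct sketch_msg_v2 msg;

    memset(&msg, 0, sizeof(msg));
    msg.type = SKETCH_IOTLB_MSG_V2;
    msg.asid = asid;
    msg.iotlb.iova = iova;
    msg.iotlb.size = size;
    msg.iotlb.uaddr = uaddr;
    msg.iotlb.perm = perm;
    msg.iotlb.type = SKETCH_IOTLB_UPDATE;
    return msg;
}
```

Since the ASID reuses an existing field, the message size does not
change, and a message with asid set to 0 is laid out exactly like a
pre-ASID V2 message.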

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Gautam Dawar <gdawar@xilinx.com>
Message-Id: <20220330180436.24644-10-gdawar@xilinx.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>


# 175d493c 30-Mar-2022 Gautam Dawar <gautam.dawar@xilinx.com>

vhost: move the backend feature bits to vhost_types.h

We should store feature bits in vhost_types.h, as has been done
for, e.g., VHOST_F_LOG_ALL.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Gautam Dawar <gdawar@xilinx.com>
Message-Id: <20220330180436.24644-2-gdawar@xilinx.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>


# 1b48dc03 23-Oct-2020 Jason Wang <jasowang@redhat.com>

vhost: vdpa: report iova range

This patch introduces a new ioctl for the vhost-vdpa device that
reports the device's iova range.

For a device that implements the get_iova_range() method, we fetch the
range from the vDPA device. If the device doesn't implement
get_iova_range() but depends on the platform IOMMU, we query it via
DOMAIN_ATTR_GEOMETRY; otherwise [0, ULLONG_MAX] is assumed.

For safety, this patch also rejects map requests that are not within
the valid range.
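
The range check can be sketched as follows. This is an illustration,
not the code added by this patch: the struct is a local stand-in for
the reported range (an inclusive first and last address), and the
function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the reported range; both ends are inclusive. */
struct sketch_iova_range {
    uint64_t first;
    uint64_t last;
};

/*
 * Return true if the mapping [iova, iova + size - 1] lies entirely inside
 * the range. Empty mappings and mappings that wrap around the end of the
 * 64-bit address space are rejected.
 */
static bool map_in_range(const struct sketch_iova_range *r,
                         uint64_t iova, uint64_t size)
{
    uint64_t last;

    if (size == 0)
        return false;
    last = iova + size - 1;
    if (last < iova)
        return false;  /* iova + size - 1 overflowed */
    return iova >= r->first && last <= r->last;
}
```

Computing the last byte first and comparing it against iova keeps the
check correct even for the default [0, ULLONG_MAX] range.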

Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20201023090043.14430-3-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>


# 25abc060 04-Aug-2020 Jason Wang <jasowang@redhat.com>

vhost-vdpa: support IOTLB batching hints

This patch extends the vhost IOTLB API to accept batch updating hints
from userspace. When userspace wants to update the device IOTLB in a
batch, it may do:

1) Write vhost_iotlb_msg with VHOST_IOTLB_BATCH_BEGIN flag
2) Perform a batch of IOTLB updates via VHOST_IOTLB_UPDATE/INVALIDATE
3) Write vhost_iotlb_msg with VHOST_IOTLB_BATCH_END flag

Vhost-vdpa may decide to batch the IOMMU/IOTLB updating in step 3 when
the vDPA device supports the set_map() op. This is useful for vDPA
devices that want to know all the mappings to tweak their own DMA
translation logic.

For vDPA devices that don't require set_map(), there is no behavior
change.

This capability is advertised via the VHOST_BACKEND_F_IOTLB_BATCH
capability.
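
The three steps above can be sketched by building the message sequence
in memory. The struct is a local stand-in for vhost_iotlb_msg, the
SKETCH_ constants are hand-copied values, and the function and region
type are hypothetical; writing the messages to the device is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values, copied by hand from the vhost UAPI headers. */
#define SKETCH_IOTLB_UPDATE      2  /* mirrors VHOST_IOTLB_UPDATE */
#define SKETCH_IOTLB_BATCH_BEGIN 5  /* mirrors VHOST_IOTLB_BATCH_BEGIN */
#define SKETCH_IOTLB_BATCH_END   6  /* mirrors VHOST_IOTLB_BATCH_END */

/* Local stand-in for struct vhost_iotlb_msg. */
struct sketch_iotlb_msg {
    uint64_t iova;
    uint64_t size;
    uint64_t uaddr;
    uint8_t perm;
    uint8_t type;
};

/* Hypothetical description of one region to map. */
struct sketch_region {
    uint64_t iova;
    uint64_t size;
    uint64_t uaddr;
    uint8_t perm;
};

/*
 * Fill out[] with BATCH_BEGIN, one UPDATE per region, then BATCH_END.
 * Returns the number of messages written, or 0 if out[] is too small.
 */
static size_t build_batch(const struct sketch_region *regions, size_t n,
                          struct sketch_iotlb_msg *out, size_t cap)
{
    size_t i;

    if (cap < 2 || n > cap - 2)
        return 0;

    memset(out, 0, (n + 2) * sizeof(*out));
    out[0].type = SKETCH_IOTLB_BATCH_BEGIN;          /* step 1 */
    for (i = 0; i < n; i++) {                        /* step 2 */
        out[i + 1].iova = regions[i].iova;
        out[i + 1].size = regions[i].size;
        out[i + 1].uaddr = regions[i].uaddr;
        out[i + 1].perm = regions[i].perm;
        out[i + 1].type = SKETCH_IOTLB_UPDATE;
    }
    out[n + 1].type = SKETCH_IOTLB_BATCH_END;        /* step 3 */
    return n + 2;
}
```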

Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20200804162048.22587-5-eli@mellanox.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>


# 4c8cf318 26-Mar-2020 Tiwei Bie <tiwei.bie@intel.com>

vhost: introduce vDPA-based backend

This patch introduces a vDPA-based vhost backend. This backend is
built on top of the same interface defined in virtio-vDPA and provides
a generic vhost interface for userspace to accelerate the virtio
devices in the guest.

This backend is implemented as a vDPA device driver on top of the same
ops used in virtio-vDPA. It creates a char device entry named
vhost-vdpa-$index for userspace to use. Userspace can use vhost ioctls
on top of this char device to set up the backend.

The vhost ioctls are extended to be type agnostic and to behave like a
virtio device; this helps to eliminate type-specific APIs such as those
of vhost_net/scsi/vsock:

- VHOST_VDPA_GET_DEVICE_ID: get the virtio device ID, which is defined
  by the virtio specification to distinguish between device types
- VHOST_VDPA_GET_VRING_NUM: get the maximum virtqueue size
  supported by the vDPA device
- VHOST_VDPA_SET/GET_STATUS: set and get the virtio status of the vDPA
  device
- VHOST_VDPA_SET/GET_CONFIG: access the virtio config space
- VHOST_VDPA_SET_VRING_ENABLE: enable a specific virtqueue

For memory mapping, the IOTLB API is mandatory for vhost-vDPA, which
means userspace drivers are required to use
VHOST_IOTLB_UPDATE/VHOST_IOTLB_INVALIDATE to add or remove the mapping
for a specific userspace memory region.

The vhost-vDPA API is designed to be type agnostic, but it allows only
net devices at the current stage. Due to the lack of control virtqueue
support, some features are filtered out by vhost-vdpa.

We will enable more features and devices in the near future.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20200326140125.19794-8-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>


# 4b867132 17-Dec-2018 Paolo Bonzini <pbonzini@redhat.com>

vhost: split structs into a separate header file

vhost structs are shared by vhost-kernel and vhost-user. Split them
into a separate file to ease copying them into programs that implement
either the server or the client side of vhost-user.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>