#
47e62e6d |
|
02-Feb-2024 |
Zhu Lingshan <lingshan.zhu@intel.com> |
vduse: implement vdpa_config_ops.get_vq_size for vduse This commit implements get_vq_size for vdpa_config_ops. This new interface is used to report per vq size. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240202163905.8834-8-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
#
d7b4e328 |
|
19-Feb-2024 |
Maxime Coquelin <maxime.coquelin@redhat.com> |
vduse: implement DMA sync callbacks Since commit 295525e29a5b ("virtio_net: merge dma operations when filling mergeable buffers"), VDUSE device require support for DMA's .sync_single_for_cpu() operation as the memory is non-coherent between the device and CPU because of the use of a bounce buffer. This patch implements both .sync_single_for_cpu() and .sync_single_for_device() callbacks, and also skip bounce buffer copies during DMA map and unmap operations if the DMA_ATTR_SKIP_CPU_SYNC attribute is set to avoid extra copies of the same buffer. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Message-Id: <20240219170606.587290-1-maxime.coquelin@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
#
4e94ddfe |
|
30-Nov-2023 |
Christian Brauner <brauner@kernel.org> |
file: remove __receive_fd() Honestly, there's little value in having a helper with and without that int __user *ufd argument. It's just messy and doesn't really give us anything. Just expose receive_fd() with that argument and get rid of that helper. Link: https://lore.kernel.org/r/20231130-vfs-files-fixes-v1-5-e73ca6f4ea83@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
3652117f |
|
22-Nov-2023 |
Christian Brauner <brauner@kernel.org> |
eventfd: simplify eventfd_signal() Ever since the eventfd type was introduced back in 2007 in commit e1ad7468c77d ("signal/timer/event: eventfd core") the eventfd_signal() function only ever passed 1 as a value for @n. There's no point in keeping that additional argument. Link: https://lore.kernel.org/r/20231122-vfs-eventfd-signal-v2-2-bd549b14ce0c@kernel.org Acked-by: Xu Yilun <yilun.xu@intel.com> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> # ocxl Acked-by: Eric Farman <farman@linux.ibm.com> # s390 Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
484f0a07 |
|
06-Oct-2023 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
vduse: make vduse_class constant Now that the driver core allows for struct class to be in read-only memory, we should make all 'class' structures declared at build time placing them into read-only memory, instead of having to be dynamically allocated at runtime. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Xie Yongji <xieyongji@bytedance.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Message-Id: <2023100643-tricolor-citizen-6c2d@gregkh> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com>
|
#
7ca26efb |
|
05-Jul-2023 |
Maxime Coquelin <maxime.coquelin@redhat.com> |
vduse: Use proper spinlock for IRQ injection The IRQ injection work used spin_lock_irq() to protect the scheduling of the softirq, but spin_lock_bh() should be used. With spin_lock_irq(), we noticed delay of more than 6 seconds between the time a NAPI polling work is scheduled and the time it is executed. Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Cc: xieyongji@bytedance.com Suggested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Message-Id: <20230705114505.63274-1-maxime.coquelin@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
|
#
f06cf1e1 |
|
22-Jun-2023 |
Maxime Coquelin <maxime.coquelin@redhat.com> |
vduse: fix NULL pointer dereference vduse_vdpa_set_vq_affinity callback can be called with NULL value as cpu_mask when deleting the vduse device. This patch resets virtqueue's IRQ affinity mask value to set all CPUs instead of dereferencing NULL cpu_mask. [ 4760.952149] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 4760.959110] #PF: supervisor read access in kernel mode [ 4760.964247] #PF: error_code(0x0000) - not-present page [ 4760.969385] PGD 0 P4D 0 [ 4760.971927] Oops: 0000 [#1] PREEMPT SMP PTI [ 4760.976112] CPU: 13 PID: 2346 Comm: vdpa Not tainted 6.4.0-rc6+ #4 [ 4760.982291] Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 2.8.1 06/26/2020 [ 4760.989769] RIP: 0010:memcpy_orig+0xc5/0x130 [ 4760.994049] Code: 16 f8 4c 89 07 4c 89 4f 08 4c 89 54 17 f0 4c 89 5c 17 f8 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 83 fa 08 72 1b <4c> 8b 06 4c 8b 4c 16 f8 4c 89 07 4c 89 4c 17 f8 c3 cc cc cc cc 66 [ 4761.012793] RSP: 0018:ffffb1d565abb830 EFLAGS: 00010246 [ 4761.018020] RAX: ffff9f4bf6b27898 RBX: ffff9f4be23969c0 RCX: ffff9f4bcadf6400 [ 4761.025152] RDX: 0000000000000008 RSI: 0000000000000000 RDI: ffff9f4bf6b27898 [ 4761.032286] RBP: 0000000000000000 R08: 0000000000000008 R09: 0000000000000000 [ 4761.039416] R10: 0000000000000000 R11: 0000000000000600 R12: 0000000000000000 [ 4761.046549] R13: 0000000000000000 R14: 0000000000000080 R15: ffffb1d565abbb10 [ 4761.053680] FS: 00007f64c2ec2740(0000) GS:ffff9f635f980000(0000) knlGS:0000000000000000 [ 4761.061765] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4761.067513] CR2: 0000000000000000 CR3: 0000001875270006 CR4: 00000000007706e0 [ 4761.074645] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 4761.081775] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 4761.088909] PKRU: 55555554 [ 4761.091620] Call Trace: [ 4761.094074] <TASK> [ 4761.096180] ? __die+0x1f/0x70 [ 4761.099238] ? page_fault_oops+0x171/0x4f0 [ 4761.103340] ? exc_page_fault+0x7b/0x180 [ 4761.107265] ? asm_exc_page_fault+0x22/0x30 [ 4761.111460] ? memcpy_orig+0xc5/0x130 [ 4761.115126] vduse_vdpa_set_vq_affinity+0x3e/0x50 [vduse] [ 4761.120533] virtnet_clean_affinity.part.0+0x3d/0x90 [virtio_net] [ 4761.126635] remove_vq_common+0x1a4/0x250 [virtio_net] [ 4761.131781] virtnet_remove+0x5d/0x70 [virtio_net] [ 4761.136580] virtio_dev_remove+0x3a/0x90 [ 4761.140509] device_release_driver_internal+0x19b/0x200 [ 4761.145742] bus_remove_device+0xc2/0x130 [ 4761.149755] device_del+0x158/0x3e0 [ 4761.153245] ? kernfs_find_ns+0x35/0xc0 [ 4761.157086] device_unregister+0x13/0x60 [ 4761.161010] unregister_virtio_device+0x11/0x20 [ 4761.165543] device_release_driver_internal+0x19b/0x200 [ 4761.170770] bus_remove_device+0xc2/0x130 [ 4761.174782] device_del+0x158/0x3e0 [ 4761.178276] ? __pfx_vdpa_name_match+0x10/0x10 [vdpa] [ 4761.183336] device_unregister+0x13/0x60 [ 4761.187260] vdpa_nl_cmd_dev_del_set_doit+0x63/0xe0 [vdpa] Fixes: 28f6288eb63d ("vduse: Support set_vq_affinity callback") Cc: xieyongji@bytedance.com Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Message-Id: <20230622204851.318125-1-maxime.coquelin@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
|
#
4c630f30 |
|
17-May-2023 |
Lorenzo Stoakes <lstoakes@gmail.com> |
mm/gup: remove vmas parameter from pin_user_pages() We are now in a position where no caller of pin_user_pages() requires the vmas parameter at all, so eliminate this parameter from the function and all callers. This clears the way to removing the vmas parameter from GUP altogether. Link: https://lkml.kernel.org/r/195a99ae949c9f5cb589d2222b736ced96ec199a.1684350871.git.lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> [qib] Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> [drivers/media] Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian König <christian.koenig@amd.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
a90e8608 |
|
29-May-2023 |
Sheng Zhao <sheng.zhao@bytedance.com> |
vduse: avoid empty string for dev name Syzkaller hits a kernel WARN when the first character of the dev name provided is NULL. Solution is to add a NULL check before calling cdev_device_add() in vduse_create_dev(). kobject: (0000000072042169): attempted to be registered with empty name! WARNING: CPU: 0 PID: 112695 at lib/kobject.c:236 Call Trace: kobject_add_varg linux/src/lib/kobject.c:390 [inline] kobject_add+0xf6/0x150 linux/src/lib/kobject.c:442 device_add+0x28f/0xc20 linux/src/drivers/base/core.c:2167 cdev_device_add+0x83/0xc0 linux/src/fs/char_dev.c:546 vduse_create_dev linux/src/drivers/vdpa/vdpa_user/vduse_dev.c:2254 [inline] vduse_ioctl+0x7b5/0xf30 linux/src/drivers/vdpa/vdpa_user/vduse_dev.c:2316 vfs_ioctl linux/src/fs/ioctl.c:47 [inline] file_ioctl linux/src/fs/ioctl.c:510 [inline] do_vfs_ioctl+0x14b/0xa80 linux/src/fs/ioctl.c:697 ksys_ioctl+0x7c/0xa0 linux/src/fs/ioctl.c:714 __do_sys_ioctl linux/src/fs/ioctl.c:721 [inline] __se_sys_ioctl linux/src/fs/ioctl.c:719 [inline] __x64_sys_ioctl+0x42/0x50 linux/src/fs/ioctl.c:719 do_syscall_64+0x94/0x330 linux/src/arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Cc: "Xie Yongji" <xieyongji@bytedance.com> Reported-by: Xianjun Zeng <zengxianjun@bytedance.com> Signed-off-by: Sheng Zhao <sheng.zhao@bytedance.com> Message-Id: <20230530033626.1266794-1-sheng.zhao@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xie Yongji <xieyongji@bytedance.com> Cc: "Michael S. Tsirkin"<mst@redhat.com>, "Jason Wang"<jasowang@redhat.com>, Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
|
#
b774f93d |
|
22-Mar-2023 |
Xie Yongji <xieyongji@bytedance.com> |
vduse: Support specifying bounce buffer size via sysfs As discussed in [1], this adds sysfs interface to support specifying bounce buffer size in virtio-vdpa case. It would be a performance tuning parameter for high throughput workloads. [1] https://lore.kernel.org/netdev/e8f25a35-9d45-69f9-795d-bdbbb90337a3@redhat.com/ Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230323053043.35-12-xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
#
d4438d23 |
|
22-Mar-2023 |
Xie Yongji <xieyongji@bytedance.com> |
vduse: Delay iova domain creation Delay creating iova domain until the vduse device is registered to vdpa bus. This is a preparation for adding sysfs interface to support specifying bounce buffer size for the iova domain. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230323053043.35-11-xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
#
e38632dd |
|
22-Mar-2023 |
Xie Yongji <xieyongji@bytedance.com> |
vduse: Signal vq trigger eventfd directly if possible Now the vdpa callback will associate an trigger eventfd in some cases. For performance reasons, VDUSE can signal it directly during irq injection. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230323053043.35-10-xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
#
66640f4a |
|
22-Mar-2023 |
Xie Yongji <xieyongji@bytedance.com> |
vduse: Add sysfs interface for irq callback affinity Add sysfs interface for each vduse virtqueue to get/set the affinity for irq callback. This might be useful for performance tuning when the irq callback affinity mask contains more than one CPU. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Message-Id: <20230323053043.35-8-xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
|
#
bfae1648 |
|
22-Mar-2023 |
Xie Yongji <xieyongji@bytedance.com> |
vduse: Support get_vq_affinity callback This implements get_vq_affinity callback so that the virtio-blk driver can build the blk-mq queues based on the irq callback affinity. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Message-Id: <20230323053043.35-7-xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
|
#
28f6288e |
|
22-Mar-2023 |
Xie Yongji <xieyongji@bytedance.com> |
vduse: Support set_vq_affinity callback Since virtio-vdpa bus driver already support interrupt affinity spreading mechanism, let's implement the set_vq_affinity callback to bring it to vduse device. After we get the virtqueue's affinity, we can spread IRQs between CPUs in the affinity mask, in a round-robin manner, to run the irq callback. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Message-Id: <20230323053043.35-6-xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
|
#
78885597 |
|
22-Mar-2023 |
Xie Yongji <xieyongji@bytedance.com> |
vduse: Refactor allocation for vduse virtqueues Allocate memory for vduse virtqueues one by one instead of doing one allocation for all of them. This is a preparation for adding sysfs interface for virtqueues. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230323053043.35-5-xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
#
1aaba11d |
|
13-Mar-2023 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
driver core: class: remove module * from class_create() The module pointer in class_create() never actually did anything, and it shouldn't have been requred to be set as a parameter even if it did something. So just remove it and fix up all callers of the function in the kernel tree at the same time. Cc: "Rafael J. Wysocki" <rafael@kernel.org> Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20230313181843.1207845-4-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
937c783a |
|
28-Nov-2022 |
Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> |
vduse: Validate vq_num in vduse_validate_config() Add a limit to 'config->vq_num' which is user controlled data which comes from an vduse_ioctl to prevent large memory allocations. Micheal says - This limit is somewhat arbitrary. However, currently virtio pci and ccw are limited to a 16 bit vq number. While MMIO isn't it is also isn't used with lots of VQs due to current lack of support for per-vq interrupts. Thus, the 0xffff limit on number of VQs corresponding to a 16-bit VQ number seems sufficient for now. This is found using static analysis with smatch. Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Message-Id: <20221128155717.2579992-1-harshit.m.mogalapalli@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
|
#
ff62b8e6 |
|
23-Nov-2022 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
driver core: make struct class.devnode() take a const * The devnode() in struct class should not be modifying the device that is passed into it, so mark it as a const * and propagate the function signature changes out into all relevant subsystems that use this callback. Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <axboe@kernel.dk> Cc: Justin Sanders <justin@coraid.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Benjamin Gaignard <benjamin.gaignard@collabora.com> Cc: Liam Mark <lmark@codeaurora.org> Cc: Laura Abbott <labbott@redhat.com> Cc: Brian Starkey <Brian.Starkey@arm.com> Cc: John Stultz <jstultz@google.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Leon Romanovsky <leon@kernel.org> Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Sean Young <sean@mess.org> Cc: Frank Haverkamp <haver@linux.ibm.com> Cc: Jiri Slaby <jirislaby@kernel.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Takashi Iwai <tiwai@suse.com> Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Xie Yongji <xieyongji@bytedance.com> Cc: Gautam Dawar <gautam.dawar@xilinx.com> Cc: Dan Carpenter <error27@gmail.com> Cc: Eli Cohen <elic@nvidia.com> Cc: Parav Pandit <parav@nvidia.com> Cc: Maxime Coquelin <maxime.coquelin@redhat.com> Cc: alsa-devel@alsa-project.org Cc: dri-devel@lists.freedesktop.org Cc: kvm@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-block@vger.kernel.org Cc: linux-input@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-media@vger.kernel.org Cc: linux-rdma@vger.kernel.org Cc: linux-scsi@vger.kernel.org Cc: linux-usb@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Link: https://lore.kernel.org/r/20221123122523.1332370-2-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
46f8a292 |
|
31-Aug-2022 |
Maxime Coquelin <maxime.coquelin@redhat.com> |
vduse: prevent uninitialized memory accesses If the VDUSE application provides a smaller config space than the driver expects, the driver may use uninitialized memory from the stack. This patch prevents it by initializing the buffer passed by the driver to store the config value. This fix addresses CVE-2022-2308. Cc: stable@vger.kernel.org # v5.15+ Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Reviewed-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Message-Id: <20220831154923.97809-1-maxime.coquelin@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
|
#
ad146355 |
|
02-Aug-2022 |
Xie Yongji <xieyongji@bytedance.com> |
vduse: Support querying information of IOVA regions This introduces a new ioctl: VDUSE_IOTLB_GET_INFO to support querying some information of IOVA regions. Now it can be used to query whether the IOVA region supports userspace memory registration. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Message-Id: <20220803045523.23851-6-xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
|
#
79a463be |
|
02-Aug-2022 |
Xie Yongji <xieyongji@bytedance.com> |
vduse: Support registering userspace memory for IOVA regions Introduce two ioctls: VDUSE_IOTLB_REG_UMEM and VDUSE_IOTLB_DEREG_UMEM to support registering and de-registering userspace memory for IOVA regions. Now it only supports registering userspace memory for bounce buffer region in virtio-vdpa case. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220803045523.23851-5-xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
#
0e0348ac3 |
|
13-Jun-2022 |
Parav Pandit <parav@nvidia.com> |
vduse: Tie vduse mgmtdev and its device vduse devices are not backed by any real devices such as PCI. Hence it doesn't have any parent device linked to it. Kernel driver model in [1] suggests to avoid an empty device release callback. Hence tie the mgmtdevice object's life cycle to an allocate dummy struct device instead of static one. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/core-api/kobject.rst?h=v5.18-rc7#n284 Signed-off-by: Parav Pandit <parav@nvidia.com> Message-Id: <20220613195223.473966-1-parav@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com>
|
#
b27ee76c |
|
26-Apr-2022 |
Xie Yongji <xieyongji@bytedance.com> |
vduse: Fix NULL pointer dereference on sysfs access The control device has no drvdata. So we will get a NULL pointer dereference when accessing control device's msg_timeout attribute via sysfs: [ 132.841881][ T3644] BUG: kernel NULL pointer dereference, address: 00000000000000f8 [ 132.850619][ T3644] RIP: 0010:msg_timeout_show (drivers/vdpa/vdpa_user/vduse_dev.c:1271) [ 132.869447][ T3644] dev_attr_show (drivers/base/core.c:2094) [ 132.870215][ T3644] sysfs_kf_seq_show (fs/sysfs/file.c:59) [ 132.871164][ T3644] ? device_remove_bin_file (drivers/base/core.c:2088) [ 132.872082][ T3644] kernfs_seq_show (fs/kernfs/file.c:164) [ 132.872838][ T3644] seq_read_iter (fs/seq_file.c:230) [ 132.873578][ T3644] ? __vmalloc_area_node (mm/vmalloc.c:3041) [ 132.874532][ T3644] kernfs_fop_read_iter (fs/kernfs/file.c:238) [ 132.875513][ T3644] __kernel_read (fs/read_write.c:440 (discriminator 1)) [ 132.876319][ T3644] kernel_read (fs/read_write.c:459) [ 132.877129][ T3644] kernel_read_file (fs/kernel_read_file.c:94) [ 132.877978][ T3644] kernel_read_file_from_fd (include/linux/file.h:45 fs/kernel_read_file.c:186) [ 132.879019][ T3644] __do_sys_finit_module (kernel/module.c:4207) [ 132.879930][ T3644] __ia32_sys_finit_module (kernel/module.c:4189) [ 132.880930][ T3644] do_int80_syscall_32 (arch/x86/entry/common.c:112 arch/x86/entry/common.c:132) [ 132.881847][ T3644] entry_INT80_compat (arch/x86/entry/entry_64_compat.S:419) To fix it, don't create the unneeded attribute for control device anymore. Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Reported-by: kernel test robot <oliver.sang@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Message-Id: <20220426073656.229-1-xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
#
db9adcbf |
|
30-Mar-2022 |
Gautam Dawar <gautam.dawar@xilinx.com> |
vdpa: multiple address spaces support This patches introduces the multiple address spaces support for vDPA device. This idea is to identify a specific address space via an dedicated identifier - ASID. During vDPA device allocation, vDPA device driver needs to report the number of address spaces supported by the device then the DMA mapping ops of the vDPA device needs to be extended to support ASID. This helps to isolate the environments for the virtqueue that will not be assigned directly. E.g in the case of virtio-net, the control virtqueue will not be assigned directly to guest. As a start, simply claim 1 virtqueue groups and 1 address spaces for all vDPA devices. And vhost-vDPA will simply reject the device with more than 1 virtqueue groups or address spaces. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-7-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
#
d4821902 |
|
30-Mar-2022 |
Gautam Dawar <gautam.dawar@xilinx.com> |
vdpa: introduce virtqueue groups This patch introduces virtqueue groups to vDPA device. The virtqueue group is the minimal set of virtqueues that must share an address space. And the address space identifier could only be attached to a specific virtqueue group. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-6-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
#
a64917bc |
|
05-Jan-2022 |
Eli Cohen <elic@nvidia.com> |
vdpa: Provide interface to read driver features Provide an interface to read the negotiated features. This is needed when building the netlink message in vdpa_dev_net_config_fill(). Also fix the implementation of vdpa_dev_net_config_fill() to use the negotiated features instead of the device features. To make APIs clearer, make the following name changes to struct vdpa_config_ops so they better describe their operations: get_features -> get_device_features set_features -> set_driver_features Finally, add get_driver_features to return the negotiated features and add implementation to all the upstream drivers. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220105114646.577224-2-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
#
b4d80c8d |
|
06-Dec-2021 |
Guanjun <guanjun@linux.alibaba.com> |
vduse: moving kvfree into caller This free action should be moved into caller 'vduse_ioctl' in concert with the allocation. No functional change. Signed-off-by: Guanjun <guanjun@linux.alibaba.com> Link: https://lore.kernel.org/r/1638780498-55571-1-git-send-email-guanjun@linux.alibaba.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
#
dc1db006 |
|
08-Dec-2021 |
Dan Carpenter <dan.carpenter@oracle.com> |
vduse: check that offset is within bounds in get_config() This condition checks "len" but it does not check "offset" and that could result in an out of bounds read if "offset > dev->config_size". The problem is that since both variables are unsigned the "dev->config_size - offset" subtraction would result in a very high unsigned value. I think these checks might not be necessary because "len" and "offset" are supposed to already have been validated using the vhost_vdpa_config_validate() function. But I do not know the code perfectly, and I like to be safe. Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20211208150956.GA29160@kili Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: stable@vger.kernel.org
|
#
ff9f9c6e |
|
08-Dec-2021 |
Dan Carpenter <dan.carpenter@oracle.com> |
vduse: fix memory corruption in vduse_dev_ioctl() The "config.offset" comes from the user. There needs to a check to prevent it being out of bounds. The "config.offset" and "dev->config_size" variables are both type u32. So if the offset if out of bounds then the "dev->config_size - config.offset" subtraction results in a very high u32 value. The out of bounds offset can result in memory corruption. Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20211208103307.GA3778@kili Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: stable@vger.kernel.org
|
#
d8ca2fa5 |
|
26-Oct-2021 |
Parav Pandit <parav@nvidia.com> |
vdpa: Enable user to set mac and mtu of vdpa device $ vdpa dev add name bar mgmtdev vdpasim_net mac 00:11:22:33:44:55 mtu 9000 $ vdpa dev config show bar: mac 00:11:22:33:44:55 link up link_announce false mtu 9000 $ vdpa dev config show -jp { "config": { "bar": { "mac": "00:11:22:33:44:55", "link ": "up", "link_announce ": false, "mtu": 9000, } } } Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20211026175519.87795-5-parav@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
|
#
0943aacf |
|
29-Sep-2021 |
Xie Yongji <xieyongji@bytedance.com> |
vduse: Fix race condition between resetting and irq injecting The interrupt might be triggered after a reset since there is no synchronization between resetting and irq injecting. And it might break something if the interrupt is delayed until a new round of device initialization. Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20210929083050.88-1-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
#
1394103f |
|
23-Sep-2021 |
Xie Yongji <xieyongji@bytedance.com> |
vduse: Disallow injecting interrupt before DRIVER_OK is set The interrupt callback should not be triggered before DRIVER_OK is set. Otherwise, it might break the virtio device driver. So let's add a check to avoid the unexpected behavior. Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20210923075722.98-1-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
#
7bb5fb20 |
|
06-Sep-2021 |
Xie Yongji <xieyongji@bytedance.com> |
vduse: Cleanup the old kernel states after reset failure We should cleanup the old kernel states e.g. interrupt callback no matter whether the userspace handle the reset correctly or not since virtio-vdpa can't handle the reset failure now. Otherwise, the old state might be used after reset which might break something, e.g. the old interrupt callback might be triggered by userspace after reset, which can break the virtio device driver. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20210906142158.181-1-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
|
#
6243e3c7 |
|
07-Sep-2021 |
Dan Carpenter <dan.carpenter@oracle.com> |
vduse: missing error code in vduse_init() This should return -ENOMEM if alloc_workqueue() fails. Currently it returns success. Fixes: b66219796563 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20210907073223.GA18254@kili Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com>
|
#
7bc7f618 |
|
31-Aug-2021 |
Xie Yongji <xieyongji@bytedance.com> |
Documentation: Add documentation for VDUSE VDUSE (vDPA Device in Userspace) is a framework to support implementing software-emulated vDPA devices in userspace. This document is intended to clarify the VDUSE design and usage. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210831103634.33-14-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
#
c8a6153b |
|
31-Aug-2021 |
Xie Yongji <xieyongji@bytedance.com> |
vduse: Introduce VDUSE - vDPA Device in Userspace This VDUSE driver enables implementing software-emulated vDPA devices in userspace. The vDPA device is created by ioctl(VDUSE_CREATE_DEV) on /dev/vduse/control. Then a char device interface (/dev/vduse/$NAME) is exported to userspace for device emulation. In order to make the device emulation more secure, the device's control path is handled in kernel. A message mechnism is introduced to forward some dataplane related control messages to userspace. And in the data path, the DMA buffer will be mapped into userspace address space through different ways depending on the vDPA bus to which the vDPA device is attached. In virtio-vdpa case, the MMU-based software IOTLB is used to achieve that. And in vhost-vdpa case, the DMA buffer is reside in a userspace memory region which can be shared to the VDUSE userspace processs via transferring the shmfd. For more details on VDUSE design and usage, please see the follow-on Documentation commit. NB(mst): when merging this with b542e383d8c0 ("eventfd: Make signal recursion protection a task bit") replace eventfd_signal_count with eventfd_signal_allowed, and drop the previous ("eventfd: Export eventfd_wake_count to modules"). Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20210831103634.33-13-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|