History log of /freebsd-current/usr.sbin/bhyve/pci_nvme.c
Revision Date Author Comments
# c46860db 29-Jan-2024 John Baldwin <jhb@FreeBSD.org>

bhyve: Use NVMEF macro to construct fields

Reviewed by: corvink, chuck (older version)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D43607


# c85b3903 29-Jan-2024 John Baldwin <jhb@FreeBSD.org>

bhyve: Use the NVMEM macro instead of expanded versions

Reviewed by: corvink, chuck
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D43603


# 1dade1f2 29-Jan-2024 John Baldwin <jhb@FreeBSD.org>

nvme: Rename NVMEB helper macro to NVMEM

The current macro always builds a full mask for a named field, so use
the M suffix for mask.

Reviewed by: chuck, imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D43601
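
As an illustration of the mask, value, and field helpers these commits refer to, here is a minimal sketch. The DEMO_* names and the example field layout are hypothetical; they are not copied from nvme.h and only show the general pattern of building a mask, extracting a value, and positioning a value from a field's shift and mask.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical register field, named DEMO_FIELD for illustration only:
 * a 4-bit field occupying bits 19:16 of a 32-bit register.
 */
#define DEMO_FIELD_SHIFT	16
#define DEMO_FIELD_MASK		0xFu

/* Full in-place mask for a named field (the "M" in the suffix). */
#define DEMO_M(name)		(name##_MASK << name##_SHIFT)
/* Extract a field's value from a register word. */
#define DEMO_V(name, reg)	(((reg) >> name##_SHIFT) & name##_MASK)
/* Position a value so it can be OR'ed into a register word. */
#define DEMO_F(name, val)	(((val) & name##_MASK) << name##_SHIFT)

static uint32_t
demo_set_field(uint32_t reg, uint32_t val)
{
	/* Clear the old field contents, then insert the new value. */
	reg &= ~DEMO_M(DEMO_FIELD);
	reg |= DEMO_F(DEMO_FIELD, val);
	assert(DEMO_V(DEMO_FIELD, reg) == (val & DEMO_FIELD_MASK));
	return (reg);
}
```

Using the value helper even for a field with a shift of 0 keeps every field access in the same form, which is the consistency argument made for the ASQS change.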


# c4269236 29-Jan-2024 John Baldwin <jhb@FreeBSD.org>

bhyve: Use NVMEV to read the ASQS field of AQA

This is not a functional change, but just being consistent instead of
omitting a shift by 0.

Reviewed by: corvink, chuck, imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D43600


# 94962f5a 29-Jan-2024 John Baldwin <jhb@FreeBSD.org>

bhyve: Use the NVMEV macro instead of expanded versions

Reviewed by: corvink, chuck (older version)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D43598


# 32557d16 12-Oct-2023 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Add NQN value

Add a NVMe Qualified Name (NQN) to the Controller Data structure using
the "first format" (i.e., "... used by any organization that owns a
domain name" Section 7.9 NVM-Express 1.4c 2021.06.28 Ratified).

This avoids a Linux kernel warning about a missing or invalid NQN.

Reviewed by: jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42058


# 18974bd6 17-Aug-2023 John Baldwin <jhb@FreeBSD.org>

bhyve: Store the FreeBSD OUI in little-endian in the controller data

Section 7.10.3 of the NVME 1.4b specification states that the IEEE OUI
in the identify controller structure is stored in little-endian format
(unlike the embedded OUI in EUI64 identifiers).

Reviewed by: corvink, chuck, imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D41487
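
A minimal sketch of what little-endian storage of a 24-bit OUI looks like. The function name, signature, and the example value used to exercise it are hypothetical and are not taken from pci_nvme.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Store a 24-bit OUI into a 3-byte field, least significant byte first.
 * The function name and signature are hypothetical.
 */
static void
demo_store_oui_le(uint8_t dst[3], uint32_t oui)
{
	assert((oui & ~0xFFFFFFu) == 0);	/* OUIs are 24 bits wide */
	dst[0] = oui & 0xFF;
	dst[1] = (oui >> 8) & 0xFF;
	dst[2] = (oui >> 16) & 0xFF;
}
```

For a placeholder OUI of 0xAABBCC this stores the bytes CC, BB, AA in that order, the reverse of how the same OUI would appear at the start of an EUI64 identifier.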


# 1d386b48 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 13013d26 28-Jun-2023 Mark Johnston <markj@FreeBSD.org>

bhyve: Stop calling pci_lintr_request() in the NVMe device model

The device model effectively assumes that MSI-X is enabled (it never
asserts the legacy interrupt), so any guest which relies on being able
to use the legacy PCI interrupt will fail.

The WIP arm64 port does not implement legacy PCI interrupts, but NVMe
emulation is potentially useful there. Simply remove the call.

Reviewed by: corvink, chuck, jhb
Tested by: chuck
MFC after: 1 month
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D40731


# 480bef94 16-Aug-2021 Corvin Köhne <corvink@FreeBSD.org>

bhyve: add bootindex option for several devices

The bootindex option creates an entry in the "bootorder" fwcfg file.
This file can be picked up by the guest firmware to determine the
bootorder. Nevertheless, it's not guaranteed that the guest firmware
uses the bootorder. At the moment, our OVMF ignores the bootorder. This
will change in the future.

If guest firmware supports the "bootorder" fwcfg file and no device uses
the bootindex option, the boot order is determined by the firmware
itself. If one or more devices specify a bootindex, the first bootable
device with the lowest bootindex will be booted. It's not guaranteed that
devices without a bootindex will be recognized as bootable from the
firmware in that case.

Reviewed by: jhb
MFC after: 1 week
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D39285


# 0dc159ce 01-Jun-2023 Elyes Haouas <ehaouas@noos.fr>

bhyve: Fix typos

Signed-off-by: Elyes Haouas <ehaouas@noos.fr>
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/653


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# b344bd3a 23-Apr-2023 Val Packett <val@packett.cool>

ext2fs: extract crc16 into sys/crc16.h

deduplicate this as it might be needed for other drivers (e.g. Apple SPI-HID)

Sponsored by: https://www.patreon.com/valpackett
Reviewed by: chuck, imp
MFC after: 1 month
Differential revision: https://reviews.freebsd.org/D32879


# 1308a17b 14-Mar-2023 Elyes Haouas <ehaouas@noos.fr>

bhyve: Remove trailing semicolon

Macros shouldn't use trailing semicolon.

Signed-off-by: Elyes Haouas <ehaouas@noos.fr>
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/654


# 6a284cac 19-Jan-2023 John Baldwin <jhb@FreeBSD.org>

bhyve: Remove vmctx argument from PCI device model methods.

Most of these arguments were unused. Device models which do need
access to the vmctx in one of these methods can obtain it from the
pi_vmctx member of the pci_devinst argument instead.

Reviewed by: corvink, markj
Differential Revision: https://reviews.freebsd.org/D38096


# 78c2cd83 09-Dec-2022 John Baldwin <jhb@FreeBSD.org>

bhyve: Remove unused vcpu argument from PCI read/write methods.

Reviewed by: corvink, markj
Differential Revision: https://reviews.freebsd.org/D37652


# 34781da5 09-Dec-2022 John Baldwin <jhb@FreeBSD.org>

bhyve: Remove unused argument from pci_nvme_handle_doorbell.

Reviewed by: corvink, chuck, markj
Differential Revision: https://reviews.freebsd.org/D37650


# 15cebe3d 28-Nov-2022 John Baldwin <jhb@FreeBSD.org>

bhyve: Fix sign compare warnings in the NVMe device model.

Reviewed by: corvink
Differential Revision: https://reviews.freebsd.org/D37489


# 5d805962 28-Nov-2022 John Baldwin <jhb@FreeBSD.org>

bhyve: Avoid unlikely truncation of the blockif ident strings.

The ident strings for NVMe and VirtIO block devices do not contain the
bus, and the various fields can potentially use up to three characters
when printed as unsigned values (full range of uint8_t) even if not
likely in practice.

Reviewed by: corvink, chuck
Differential Revision: https://reviews.freebsd.org/D37488


# 47d61162 28-Nov-2022 John Baldwin <jhb@FreeBSD.org>

bhyve: Clear lid to 0 for internal device errors for NVMe AENs.

Reported by: GCC
Reviewed by: corvink, chuck, imp, markj
Differential Revision: https://reviews.freebsd.org/D37487


# 1d9e8a9e 28-Nov-2022 John Baldwin <jhb@FreeBSD.org>

bhyve: Don't leak uninitialized bits in NVMe completion statuses.

In some cases, some bits in the 16-bit status word were never
initialized.

Reported by: GCC
Reviewed by: corvink, chuck, markj
Differential Revision: https://reviews.freebsd.org/D37486


# 10846c53 14-Nov-2022 Wanpeng Qian <wanpengqian@gmail.com>

bhyve: nvme controller obey async event setting when reporting critical temperature

Async event report is controlled by async event configuration feature
setting. When reporting a critical temperature warning, check the async
event configuration.

Approved by: manu (mentor)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D37355


# 05a21658 14-Nov-2022 Wanpeng Qian <wanpengqian@gmail.com>

bhyve: return FEATURE_NOT_CHANGEABLE for unimplemented feature of NVMe controller

Set Features is a feature-specific function. Currently only some
features have the set procedure. For features that are not handled by
the controller, we should return a FEATURE_NOT_CHANGEABLE error message.

Approved by: manu (mentor)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D32802


# 8ab99dbe 14-Nov-2022 Wanpeng Qian <wanpengqian@gmail.com>

bhyve: abort and return FEATURE_NOT_SAVEABLE while set feature with a save flag for NVMe controller.

Currently bhyve's NVMe controller cannot save feature values across
reboots. It should return a FEATURE_NOT_SAVEABLE error when the command
specifies a save flag.

Quote from NVMe specification, page 205:

https://nvmexpress.org/wp-content/uploads/NVM-Express-1_4-2019.06.10-Ratified.pdf

If the Feature Identifier specified in the Set Features command is not
saveable by the controller and the controller receives a Set Features
command with the Save bit set to one, then the command shall be aborted
with a status of Feature Identifier Not Saveable.

Reviewed by: chuck (older version)
Approved by: manu (mentor)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D32767


# b631954f 04-Nov-2022 Wanpeng Qian <wanpengqian@gmail.com>

bhyve: initial PowerCycles value

Currently the PowerCycles field of the Log Page is 0, which is an invalid
value. This patch initializes the PowerCycles data to 1.

MFC after: 1 week
Approved by: manu (mentor)
Reviewed By: grehan (older version), chuck, corvink
Differential Revision: https://reviews.freebsd.org/D32558


# ae71263c 27-Oct-2022 Mark Johnston <markj@FreeBSD.org>

bhyve: Remove an unused parameter from pci_nvme_append_iov_req()

No functional change intended.

MFC after: 1 week
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D37116


# ed721684 23-Oct-2022 Mark Johnston <markj@FreeBSD.org>

bhyve: Address some signed/unsigned comparison warnings

MFC after: 1 week


# 6391be30 16-Aug-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Switch to POSIX standard functions

Switch bzero to memset and bcopy to memcpy

Reviewed by: imp, jhb
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D36215


# 37045dfa 16-Aug-2022 Mark Johnston <markj@FreeBSD.org>

bhyve: Mark variables and functions as static where appropriate

Mark them const as well when it makes sense to do so. No functional
change intended.

MFC after: 1 week
Sponsored by: The FreeBSD Foundation


# 715f82e4 16-Aug-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Support minimal Controller list

Controllers must support the Identify Controller list if they support
Namespace Management. But the UNH NVMe tests use this command regardless
of whether the device under test supports Namespace Management.

This implementation returns an empty Controller list (i.e., Number of
Identifiers is zero).

Fixes UNH Test 1.1.2

Reviewed by: jhb
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D36193


# ec0efe34 16-Aug-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Fix reported SANICAP value

The NVMe specification only allows Controllers compliant with the
revision 1.3 and earlier specification to report a value of 0x0 in the
No-Deallocate Modifies Media After Sanitize (NODMMAS) field.

For our revision 1.4 Controller, report that media is not modified after
Sanitize as the implementation does not implement Sanitize.

Fixes UNH Test 1.1.2

Reviewed by: jhb
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D36192


# 9f678cfc 14-Aug-2022 Wanpeng Qian <wanpengqian@gmail.com>

bhyve nvme: Fix firmware read only initialization

Summary:
Code was using the mask value without the shift.

Test Plan: Within FreeBSD/Linux guest, Identify NVMe controller to check the result.

Reviewed by: chuck, imp
MFC after: 2 weeks
Signed-off-by: Wanpeng Qian <wanpengqian@gmail.com>
Differential Revision: https://reviews.freebsd.org/D32659


# 3cae1004 14-Aug-2022 WanpengQian <wanpengqian@gmail.com>

bhyve nvme: Fix Active Firmware Info

Summary:
Currently Active Firmware Info is not initialized.

Fix is to initialize the Active Firmware Info to Slot 1.

Test Plan: Within FreeBSD/Linux guests, show the Firmware Logpage to confirm.

Reviewed By: chuck
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D32658


# eae0210c 14-Aug-2022 WanpengQian <wanpengqian@gmail.com>

bhyve: Fix Number of Power States Supported value

Summary:
Set Number of Power States Supported to indicate 1 power state. Keep the
Power State Descriptor data structures as zero to indicate "Not
reported".

Test Plan:
Within FreeBSD/Linux guests, list the number of power states and check
the Max Power value.

Reviewed By: markj, chuck
MFC after: 2 weeks
Signed-off-by: Wanpeng Qian <wanpengqian@gmail.com>
Differential Revision: https://reviews.freebsd.org/D32657


# b6ecef28 14-Aug-2022 Mark Johnston <markj@FreeBSD.org>

bhyve: Address uses of uninitialized variables in pci_nvme.c

The debug print in nvme_opc_get_log_page() would print an uninitialized
local variable.

In nvme_opc_write_read(), a failed LBA bounds check would cause
pci_nvme_stats_write_read_update() to be called with an uninitialized
variable as a parameter. Although the parameter is unused when the
check fails (and so status != 0), LLVM 14 emits some bogus machine code
in this path, which happens to result in a segfault when it gets
executed.

PR: 265749
Reviewed by: chuck, emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36119


# af86d12c 14-Aug-2022 Mark Johnston <markj@FreeBSD.org>

bhyve: Address -Wunused* warnings in pci_nvme.c

Currently these are not reported because bhyve is compiled with WARNS=2.
Let's start taking small steps towards enabling more warnings.

No functional change intended.

Reviewed by: chuck, imp, emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36118


# 7376c08c 09-Jun-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Fix uninitialized pointer

The Dataset Management code could free an uninitialized pointer if the
device doesn't support the Dataset Management command.

PR: 264548
Reported by: Robert Morris <rtm@lcs.mit.edu>


# d7d1beca 14-Aug-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Fix Controller init error cases

Fuzzing of bhyve uncovered an assertion failure in the NVMe emulation.
Investigation uncovered several corner cases the code did not handle.
This change handles several Controller initialization errors, including
- bad AQ sizes
- bad AQ vm_map_gpa
- doorbell writes prior to RDY
- doorbell writes to uninitialized queue
- CSTS.RDY if CFS set

PR: 256317,256319,256320,256322
Reported by: Cheolwoo Myung <cwmyung@snu.ac.kr>
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D35453


# 3d367862 14-Aug-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Check return value of mapped memory

Fuzzing of bhyve using hyfuzz discovered a way to cause a segmentation
fault in the NVMe emulation. If a guest specifies a physical address in
either the PRP1 or PRP2 field of a command that cannot be mapped from
guest to host, the function paddr_guest2host() returns a NULL pointer.
The NVMe emulation did not check for this error case, which allowed for
the segmentation fault to occur.

Fix is to check for a return value of NULL and indicate an error back to
the guest (Data Transfer error). While in the area, slightly refactor
the write/read blockif function to use a common error exit path.

PR: 256321
Reported by: Cheolwoo Myung <cwmyung@snu.ac.kr>
Reviewed by: imp, jhb
Differential Revision: https://reviews.freebsd.org/D35452
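
The fix pattern described above can be sketched as follows. Everything here is a simplified, hypothetical model: demo_guest2host() stands in for a guest-to-host mapping function and is not bhyve's paddr_guest2host(), and the status names are placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of guest memory: one contiguous region. */
struct demo_guest_mem {
	uint8_t		*host_base;	/* host address of guest physical 0 */
	uint64_t	 size;		/* bytes of guest memory */
};

enum demo_status {
	DEMO_SUCCESS,
	DEMO_DATA_TRANSFER_ERROR,
};

/* Return a host pointer for [gpa, gpa + len), or NULL if unmappable. */
static void *
demo_guest2host(const struct demo_guest_mem *mem, uint64_t gpa, uint64_t len)
{
	assert(mem != NULL);
	if (gpa >= mem->size || len > mem->size - gpa)
		return (NULL);
	return (mem->host_base + gpa);
}

/* Zero a guest buffer named by a PRP-style address, checking the mapping. */
static enum demo_status
demo_zero_guest_buf(const struct demo_guest_mem *mem, uint64_t prp,
    uint64_t len)
{
	uint8_t *p;

	p = demo_guest2host(mem, prp, len);
	if (p == NULL)
		return (DEMO_DATA_TRANSFER_ERROR);	/* report, don't crash */
	for (uint64_t i = 0; i < len; i++)
		p[i] = 0;
	return (DEMO_SUCCESS);
}
```

The point of the sketch is only that a guest-supplied address is never dereferenced before the mapping result has been checked.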


# 88951aaa 09-Jun-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Fix out-of-bound IOV array access

Summary:
NVMe operations indicate the memory region(s) associated with a command
via physical region pages (PRPs). Since each PRP has a fixed size,
contiguous memory regions larger than the PRP size require multiple PRP
entries.

Instead of issuing a blockif call for each PRP, the NVMe emulation
concatenates multiple contiguous PRP entries into a single blockif
request. The test for contiguous regions has a bug such that it
mistakenly treats an initial PRP address of zero as a contiguous range
and concatenates it with the previous. But because there is no previous
IOV, the concatenation code corrupts the IO request structure and leads
to a segmentation fault when the blockif request completes.

Fix is to test for the existence of a previous range before trying to
concatenate the current range with the previous one.

While in the area, rename pci_nvme_append_iov_req()'s lba parameter to
offset to match its usage.

PR: 264177
Reported by: Robert Morris <rtm@lcs.mit.edu>
Reviewed by: jhb
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D35328
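
A rough sketch of the guarded concatenation, assuming a simplified request structure. The types and names are hypothetical and much smaller than the real IO request; the only point is that the previous range is examined solely when one exists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_RANGES	8

struct demo_range {
	uint64_t	addr;
	uint64_t	len;
};

struct demo_req {
	struct demo_range	ranges[DEMO_MAX_RANGES];
	size_t			count;
};

/*
 * Append [addr, addr + len) to the request, merging with the previous
 * range when the two are adjacent. Returns 0 on success, -1 if full.
 */
static int
demo_append_range(struct demo_req *req, uint64_t addr, uint64_t len)
{
	assert(req->count <= DEMO_MAX_RANGES);

	/* Only look at the previous range if one actually exists. */
	if (req->count > 0) {
		struct demo_range *prev = &req->ranges[req->count - 1];

		if (prev->addr + prev->len == addr) {
			prev->len += len;
			return (0);
		}
	}
	if (req->count == DEMO_MAX_RANGES)
		return (-1);
	req->ranges[req->count].addr = addr;
	req->ranges[req->count].len = len;
	req->count++;
	return (0);
}
```

Without the count check, a first range starting at address zero would compare equal to the end of a zero-initialized "previous" range and be merged into an entry that does not exist.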


# f6f357ef 15-Mar-2022 Andy Fiddaman <andy@omniosce.org>

bhyve: missing mutex initializations

Explicitly initialize the mutex that a PCI virtio module passes back to
virtio.

It so happens that these mutexes were being initialized regardless, no
functional change intended.

Reviewed by: chuck, jhb
Differential Revision: https://reviews.freebsd.org/D34372


# e0ac9dc2 23-Feb-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Advertise Namespace changed AEN

Advertise Namespace Attribute Notices events in the Optional
Asynchronous Events Supported (OAES) field of the Identify Controller
data structure. Additionally, rename the enums and macros to clarify
these are AEN's related to Notices and not generic information.

Reported by: andy@omniosce.org

Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D34331


# c2318cf8 21-Feb-2022 Chuck Tuffli <chuck@FreeBSD.org>

nvme: fix spelling of Namespace

Fix spelling of a macro definition.

Reviewed by: mav, imp
Differential Revision: https://reviews.freebsd.org/D34330


# ac678b4a 30-Jan-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Fix Identify Namespace, NSID=ffffffff

If the NVMe Controller doesn't support Namespace Management, it should
return "Invalid Namespace or Format" when the Host requests Identify
Namespace with the global NSID value.

Fixes UNH IOL 16.0 Test 9.1, Case 6

Reviewed by: imp, allanjude
Tested by: jason@tubnor.net
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D33578


# fa263c53 30-Jan-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Fix Set Features, AEN

NVMe Controllers which do not support Endurance Groups must return an
error when the Endurance Group Event Aggregate Log Change Notices bit is
set in Set Features, Asynchronous Event Configuration.

Fixes UNH IOL Test 3.12, Case 8

Tested by: jason@tubnor.net
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D33577


# ff5ed0fa 30-Jan-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Fix reported VWC value

v1.4 and later NVMe Controllers report "Flush all Namespaces" support
differently.

Fixes UNH IOL 16.0 Test 2.6, Case 3

Reviewed by: imp, allanjude
Tested by: jason@tubnor.net
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D33576


# 9d8cd046 30-Jan-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Fix LBA out-of-range calculation

The function which checks for a valid LBA range mistakenly named an
input value as NLB ("Number of Logical Blocks") instead of "number of
blocks". The NVMe specification defines NLB as a zero-based value (i.e.
NLB=0x0 represents 1 block, 0x1 is 2 blocks, etc.), but the passed
parameter is a 1's-based value.

Fix is to rename the variable to avoid future confusion.

While in the neighborhood, also check that the starting LBA is less than
the size of the backing storage to avoid an integer overflow.

Reviewed by: imp, allanjude, jhb
Tested by: jason@tubnor.net
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D33575
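
The distinction between the zero-based NLB field and a plain block count, together with an overflow-safe range check, might look like the following sketch. Function and parameter names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if [slba, slba + nblocks) lies outside a device holding
 * dev_blocks logical blocks. nblocks is a count of blocks (1-based).
 */
static bool
demo_lba_out_of_range(uint64_t dev_blocks, uint64_t slba, uint64_t nblocks)
{
	assert(nblocks >= 1);	/* a command covers at least one block */

	/* Check the start first so the subtraction below cannot wrap. */
	if (slba >= dev_blocks)
		return (true);
	return (nblocks > dev_blocks - slba);
}

/* Convert the zero-based NLB field from a command into a block count. */
static uint64_t
demo_nlb_to_count(uint16_t nlb)
{
	return ((uint64_t)nlb + 1);
}
```

Comparing nblocks against dev_blocks - slba, rather than computing slba + nblocks, avoids wrapping when the guest supplies a very large starting LBA.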


# 073f2076 30-Jan-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Add Select support to Get Features

Implement basic support for the SEL field of Get Features. This returns
information about Namespace Specific features.

Fixes UNH IOL 16.0 Test 1.2, Case 13

Reviewed by: imp, allanjude
Tested by: jason@tubnor.net
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D33574


# 29241c96 30-Jan-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Update v1.4 Identify Controller data

Compliant v1.4 Controllers must report a Controller Type (CNTRLTYPE).
Also, do not advertise secure erase functionality in the Format NVM
Attributes field of the Identify Controller data structure as the
Controller does not implement secure erase.

Fixes UNH IOL Test 1.1, Case 2

Reviewed by: imp, allanjude
Tested by: jason@tubnor.net
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D33573


# ea9ee355 30-Jan-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Add Temperature Threshold support

This adds the ability for a guest OS to send Set / Get Feature,
Temperature Threshold commands. The implementation assumes a constant
temperature and will generate an Asynchronous Event Notification if the
specified threshold is above/below this value. Although the
specification allows 9 temperature values, this implementation only
implements the Composite Temperature.

While in the neighborhood, move the clear of the CSTS register in the
reset function after all other cleanup. This avoids a race with the
guest thinking the reset is complete (i.e. CSTS.RDY = 0) before the NVMe
emulation is actually complete with the reset.

Fixes UNH IOL 16.0 Test 1.7, cases 1, 2, and 4.

Tested by: jason@tubnor.net
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D33572


# 1381a118 30-Jan-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Fix Set Features

Be more conservative and only support the Features mandatory for an I/O
Controller.

Avoids a "hang" in UNH test 1.2.10 associated with Predictable Latency
Mode Configuration and Host Behavior Support features.

Tested by: jason@tubnor.net
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D33571


# 45ab4076 30-Jan-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Remove redundant AER Limit checks

The NVMe emulation checked if the Asynchronous Event Request Limit
(a.k.a AERL) would be exceeded in pci_nvme_aer_add(), but this function
is only called from nvme_opc_async_event_req() which also checks for
exceeding the AERL.

Reviewed by: imp, allanjude
Tested by: jason@tubnor.net
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D33570


# 785b5da3 30-Jan-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Add missing Admin opcodes

Don't treat unsupported Admin commands as Invalid Opcode. Instead return
the proper Invalid Field in Command.

Fixes UNH IOL test 1.17.2

Reviewed by: imp, allanjude
Tested by: jason@tubnor.net
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D33569


# b1b2a4d9 30-Jan-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Implement Log Page Offset

Modify the Get Log Page command to parse the Log Page Offset fields to
support more recent versions of the NVMe specification.

Fixes various tests for UNH Test 1.3.*

Reviewed by: imp, allanjude
Tested by: jason@tubnor.net
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D33568


# 62d47fec 30-Jan-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Fix Namespace Specific Set Features

Return an error if the feature specified in Set Features is Namespace
specific but the Namespace ID uses the Global Namespace tag.

Fixes UNH Test 1.2.7

Reviewed by: imp, allanjude
Tested by: jason@tubnor.net
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D33566


# cf76cdd4 30-Jan-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Fix NVM Format completion status

The NVM Format command is unique among the Admin commands in that it
needs to finish asynchronously. For this reason, the emulation code
invented a synthetic completion status (NVME_NO_STATUS) to indicate that
the command was still in progress and the command processing loop should
not generate a completion message. The implementation used the value
0xffff for the synthetic value as this set both the Status Code and
Status Code Type fields to reserved values.

Format initialized the completion status to this value and expected
error cases to override it with a status code/type appropriate to the
situation. The macros used to set the NVMe status are careful not to
modify bit 0 (i.e. the phase bit), which with the synthetic completion
status, causes the phase bit to get out of sync. When running tests in a
guest with illegal NVM Format commands, Admin commands would eventually
hang because it appeared there were no completions due to the incorrect
phase bit value.

Fix is to only set NVME_NO_STATUS if the blockif delete command
succeeds. While in the neighborhood, add a missing break statement when
NVM Format is not supported.

Reviewed by: imp, allanjude
Tested by: jason@tubnor.net
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D33565
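
To make the phase-bit problem concrete, here is a small sketch of a status setter that preserves bit 0. The DEMO_* names are hypothetical; the field positions follow the 16-bit status layout as commonly described for NVMe completion entries (phase in bit 0, Status Code in bits 8:1, Status Code Type in bits 11:9) and should be checked against the specification.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PHASE_MASK	0x0001u	/* bit 0: phase tag */
#define DEMO_SC_SHIFT	1	/* bits 8:1: Status Code */
#define DEMO_SC_MASK	0xFFu
#define DEMO_SCT_SHIFT	9	/* bits 11:9: Status Code Type */
#define DEMO_SCT_MASK	0x7u

#define DEMO_NO_STATUS	0xFFFFu	/* synthetic "still in progress" marker */

/* Replace the status fields while leaving the phase bit untouched. */
static void
demo_status_set(uint16_t *status, unsigned sct, unsigned sc)
{
	assert(sct <= DEMO_SCT_MASK && sc <= DEMO_SC_MASK);
	*status &= DEMO_PHASE_MASK;
	*status |= (uint16_t)((sct << DEMO_SCT_SHIFT) | (sc << DEMO_SC_SHIFT));
}
```

Starting from the synthetic 0xFFFF value, such a setter leaves bit 0 at 1 no matter which phase the queue expects, which is how an error path that overrides the synthetic status can end up with a stale phase bit.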


# 595a12f1 30-Jan-2022 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Advertise v1.4 support

Bump advertised NVMe support from v1.3 to v1.4

Reviewed by: allanjude
Tested by: jason@tubnor.net
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D33564


# c2fa905c 26-Dec-2021 Toomas Soome <tsoome@FreeBSD.org>

bhyve: clean up trailing whitespaces

Clean up trailing whitespaces. No functional changes.

Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D33681


# cf3ed8e0 15-Dec-2021 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Inform guests of namespace resize

Register a "block resize" callback to be notified of changes to the
backing storage for the Namespace. Use this to generate an Asynchronous
Event Notification, Namespace Attributes Changed when the guest OS
provides an Asynchronous Event Request.

MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D32953


# 9f1fa1a4 15-Dec-2021 Chuck Tuffli <chuck@FreeBSD.org>

bhyve nvme: Add AEN support to NVMe emulation

Add Asynchronous Event Notification infrastructure to the NVMe
emulation.

Reviewed by: imp, grehan
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D32952


# e76c0e4f 30-Aug-2021 Elliott Mitchell <ehem_freebsd@m5p.com>

bhyve: Nuke double-semicolons

A distinct number of double-semicolons ended up in bhyve. Take a pass at
getting rid of many of these harmless typos.

MFC after: 3 days


# 91064841 27-Jun-2021 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: Fix NVMe iovec construction for large IOs

The UEFI driver included with Rocky Linux 8.4 uncovered an existing bug
in the NVMe emulation's construction of iovec's.

By default, NVMe data transfer operations use a scatter-gather list in
which all entries point to a fixed size memory region. For example, if
the Memory Page Size is 4KiB, a 2MiB IO requires 512 entries. Lists
themselves are also fixed size (default is 512 entries).

Because the list size is fixed, the last entry is special. If the IO
requires more than 512 entries, the last entry in the list contains the
address of the next list of entries. But if the IO requires exactly 512
entries, the last entry points to data.

The NVMe emulation missed this logic and unconditionally treated the
last entry as a pointer to the next list. Fix is to check if the
remaining data is greater than the page size before using the last entry
as a pointer to the next list.

PR: 256422
Reported by: dave@syix.com
Tested by: jason@tubnor.net
MFC after: 5 days
Relnotes: yes
Reviewed by: imp, grehan
Differential Revision: https://reviews.freebsd.org/D30897
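
The corrected decision for the last list entry can be sketched as a small predicate. The function and its parameters are hypothetical; it only restates the rule from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decide whether entry idx of a PRP list holds data or chains to the
 * next list. bytes_left is the data still to be described, counting
 * whatever entry idx would describe.
 */
static bool
demo_prp_entry_is_chain(size_t idx, size_t entries_per_list,
    uint64_t bytes_left, uint64_t page_size)
{
	assert(idx < entries_per_list && page_size > 0);

	/* Only the final slot can chain, and only if data spills past it. */
	return (idx == entries_per_list - 1 && bytes_left > page_size);
}
```

With 512 entries per list and 4KiB pages, the last entry of an exactly 2MiB transfer has one page left to describe, so it is treated as data rather than as a pointer to another list.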


# a11ca79c 24-Jun-2021 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: fix NVMe MDTS comment

Removes an obsolete comment and adds parentheses around the macro while
in the area. No functional change.


# 3a4ab183 15-Jun-2021 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: Fix cli regression with NVMe ram

The configuration management refactoring inadvertently removed support
for a RAM-backed NVMe Namespace (i.e. -s X,nvme,ram=16384). This adds it
back.

Reported by: andy@omniosce.org
Reviewed by: jhb, andy@omniosce.org
Fixes: 621b5090487d
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D30717


# 621b5090 26-Jun-2019 John Baldwin <jhb@FreeBSD.org>

Refactor configuration management in bhyve.

Replace the existing ad-hoc configuration via various global variables
with a small database of key-value pairs. The database supports
hierarchical keys using a MIB-like syntax to name the path to a given
key. Values are always stored as strings. The API used to manage
configuration values does include wrappers for handling boolean values.
Other values that use non-string types require parsing by consumers.

The configuration values are stored in a tree using nvlists. Leaf
nodes hold string values. Configuration values are permitted to
reference other configuration values using '%(name)'. This permits
constructing template configurations.

All existing command line arguments now set configuration values. For
devices, the "-s" option parses its option argument to generate a list
of key-value pairs for the given device.

A new '-o' command line option permits setting an individual
configuration variable. The key name is always given as a full path
of dot-separated components.

A new '-k' command line option parses a simple configuration file.
This configuration file holds a flat list of 'key=value' lines where
the 'key' is the full path of a configuration variable. Lines
starting with a '#' are comments.

In general, bhyve starts by parsing command line options in sequence
and applying those settings to configuration values. Once this is
complete, bhyve then begins initializing its state based on the
configuration values. This means that subsequent configuration
options or files may override or supplement previously given settings.

A special 'config.dump' configuration value can be set to true to help
debug configuration issues. When this value is set, bhyve will print
out the configuration variables as a flat list of 'key=value' lines.

Most command line arguments map to a single configuration variable,
e.g. '-w' sets the 'x86.strictmsr' value to false. A few command
line arguments have less obvious effects:

- Multiple '-p' options append their values (as a comma-separated
list) to "vcpu.N.cpuset" values (where N is a decimal vcpu number).

- For '-s' options, a pci.<bus>.<slot>.<function> node is created.
The first argument to '-s' (the device type) is used as the value of
a "device" variable. Additional comma-separated arguments are then
parsed into 'key=value' pairs and used to set additional variables
under the device node. A PCI device emulation driver can provide
its own hook to override the parsing of the additional '-s' arguments
after the device type.

After the configuration phase has completed, the init_pci hook
then walks the "pci.<bus>.<slot>.<func>" nodes. It uses the
"device" value to find the device model to use. The device
model's init routine is passed a reference to its nvlist node
in the configuration tree which it can query for specific
variables.

The result is that a lot of the string parsing is removed from
the device models and centralized. In addition, adding a new
variable just requires teaching the model to look for the new
variable.

- For '-l' options, a similar model is used where the string is
parsed into values that are later read during initialization.
One key note here is that the serial ports use the commonly
used lowercase names from existing documentation and examples
(e.g. "lpc.com1") instead of the uppercase names previously
used internally in bhyve.

Reviewed by: grehan
MFC after: 3 months
Differential Revision: https://reviews.freebsd.org/D26035
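
As a rough illustration of the flat 'key=value' file format described above, the sketch below splits a single line in place. It is not bhyve's parser: the function name, return codes, and the handling of malformed lines are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split one configuration line of the form "key=value" in place.
 * Returns 1 and sets *key / *value on success, 0 for a comment or
 * blank line, and -1 for a line without '='. All names here are
 * hypothetical and not taken from bhyve.
 */
static int
demo_parse_config_line(char *line, char **key, char **value)
{
	char *eq;

	assert(line != NULL && key != NULL && value != NULL);

	line[strcspn(line, "\r\n")] = '\0';	/* drop the line terminator */
	if (line[0] == '#' || line[0] == '\0')
		return (0);
	eq = strchr(line, '=');
	if (eq == NULL)
		return (-1);
	*eq = '\0';
	*key = line;
	*value = eq + 1;
	return (1);
}
```

A line such as "pci.0.4.0.device=nvme" would yield the full dotted path as the key and "nvme" as the string value, matching the pci.<bus>.<slot>.<function> node layout with its "device" variable.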


# 71a51f69 23-Aug-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: NVMe queue create must init head/tail

The NVMe emulation code did not explicitly initialize queue head and
tail pointers on queue creation. As these pointers are part of
calloc()'ed memory, this only becomes a problem if the queues are
deleted and then recreated.

This error can manifest with messages about completions not matching a
command.


# c4a86c1f 23-Aug-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: NVMe set nominal health values

Some operating systems believe bhyve's emulated NVMe drive is failing
based on certain values in the SMART / Health Information log page being
zero. Fix is to set the reported temperature and available spare values
to reasonable defaults.

Submitted by: wanpengqian@gmail.com
Reviewed by: grehan
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24202


# 0ed1d2e4 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: fix NVMe Active Namespace list

The NVMe specification requires unused entries in the Identify, Active
Namespace ID data to be zero. Fix is to bzero the provided page, similar to
what is done for the Namespace Descriptors list.

Fixes UNH Tests 2.6 and 2.9

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24901


# a104b18c 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: NVMe handle zero length DSM ranges

Dataset Management range specifications may have a zero length (a.k.a.
an empty range definition). Handle the case of all ranges being empty by
completing with Success (DSM commands are advisory only). For
Deallocate, skip empty range definitions when sending TRIM's to the
backing storage.

Fixes UNH Test 2.2.4

Reviewed by: imp
Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24900


# 7669ea7b 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: fix NVMe Get Features, Predictable Latency

If the Predictable Latency Mode is not supported, NVMe Controllers must
return Invalid Field in Command status for the Get Features command
with IDs:
- Predictable Latency Mode Config
- Predictable Latency Mode Window

Fixes UNH Tests 3.6

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24899


# f97ed151 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: add NVMe Feature Interrupt Vector Config

This adds support for NVMe Get Features, Interrupt Vector Config
parameter error checking done by the UNH compliance tests.

Fixes UNH Tests 1.6.8 and 5.5.6

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24898


# 46ea6273 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: add basic NVMe Firmware Commit support

This commit updates the Identify Controller data to advertise the
Controller supports a single firmware slot and that firmware slot 1 is
read-only. Additionally, it returns an "Invalid Firmware Slot" error
when the host issues any Firmware Commit command (a.k.a. Firmware
Activate).

Fixes UNH Test 5.5.3

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24897


# 106329ef 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: Add AER support to NVMe emulation

This adds support to bhyve's NVMe device emulation for processing Async
Event Requests but not returning them (i.e. Async Event Notifications).

Fixes UNH Test 5.5.2

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24896


# 8bba8666 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: validate the NVMe LBA start and count

Add checks that the combination of Starting LBA and Number of Logical
Blocks in a command will not exceed the range of the underlying storage.

Note that because NVMe specifies the Starting LBA as a uint64_t, care
must be taken when converting it and the block count to avoid an integer
overflow.
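
One way to write such a check without risking wrap-around is sketched
below. The function name and parameters are hypothetical and this is
not claimed to be the code from the commit; it only shows why comparing
against "ns_blocks - slba" is safer than computing "slba + nblocks".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when the range [slba, slba + nblocks) does not fit in a
 * namespace of ns_blocks logical blocks. nblocks is the 1's based
 * block count (already converted from the 0's based NLB field).
 */
bool
lba_out_of_range(uint64_t slba, uint32_t nblocks, uint64_t ns_blocks)
{
	assert(nblocks != 0);
	if (slba >= ns_blocks)
		return (true);
	/* slba < ns_blocks here, so the subtraction cannot wrap. */
	return (nblocks > ns_blocks - slba);
}
```

A naive "slba + nblocks > ns_blocks" test would accept a Starting LBA
near UINT64_MAX because the sum wraps to a small number.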

Fixes UNH Tests 2.2.3, 2.3.2, and 2.4.2

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24895


# 7d248cff 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: implement NVMe SMART data I/O statistics

SMART data in NVMe includes statistics for number of read and write
commands issued as well as the number of "data units" read and written.
NVMe defines "data unit" as thousands of 512 byte blocks (e.g. 1 data
unit is 1-1,000 512 byte blocks, 3 data units are 2,001-3,000 512 byte
blocks).

This patch implements counters for:
- Data Units Read
- Data Units Written
- Host Read Commands
- Host Write Commands
and exposes the values when the guest reads the SMART/Health Log Page.

Fixes UNH Test 1.3.8

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24894


# ae638f2b 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: validate NVMe deallocate range values

For NVMe emulation, validate the Data Set Management LBA ranges do not
exceed the capacity of the backing storage. If they do, return an "LBA
Out of Range" error.

Fixes UNH Test 2.2.3

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24893


# 73cd73c0 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: base pci_nvme_ioreq size on advertised MDTS

NVMe controllers advertise their Max Data Transfer Size (MDTS) to limit
the number of page descriptors in an I/O request. Take advantage of this
and size the struct pci_nvme_ioreq accordingly.

Ensuring these values match both future-proofs the code and allows
removing some complexity which only exists to handle this possibility.

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24891


# 206edceb 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: refactor NVMe I/O read/write

Split the NVM I/O function (i.e. nvme_opc_write_read) into separate
functions - one for RAM based backing-store and another for disk based
backing-store for easier maintenance. No functional changes.

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24890


# a0900f46 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: implement NVMe Format NVM command

The Format NVM command mainly allows the host to specify the block size
and protection information used for the Namespace. As the bhyve
implementation simply maps the capabilities of the backing storage
through to the guest, there isn't anything to implement. But a side
effect of the format is the NVMe Controller shall not return any data
previously written (i.e. erase previously written data). This patch
implements this latter behavior to provide a compliant implementation.

Fixes UNH Test 1.6

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24889


# 45cf8268 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: make unsupported NVMe commands a debug message

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24888


# e3ebd421 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: add more compliant NVMe Get/Set Features

Create a generic Get/Set Features by saving off the contents of CDW11
from the Set command and returning the saved value in the completion of
the Get command. Implementation allows providing optional implementation
for both Set and Get.

Add infrastructure to determine which feature ID's are namespace
specific and flag violations of this category of error.

Also adds the feature specific behavior of Set Features, Number of
Queues to only allow this command once per Controller reset.

Fixes UNH Tests 1.2, 5.4, and 5.5.6

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24887


# d708ced6 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: fix NVMe queue creation and deletion

Add checks for various types of invalid I/O Queue Create and Delete
command parameters, including:
- QID=0
- QID>MAX
- QID already in use
- Delete an Active CQ
- Invalid QSIZE
- Invalid CQID (SQ creation)
- Invalid interrupt vector (CQ creation)

Fixes UNH Tests 1.4.2-5,7-8

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24886


# f6f02911 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: fix NVMe Get Log Page command

Fix the logic in nvme_opc_get_log_page to calculate the number of DWORDS
(uint32_t) instead of WORDS (uint16_t) for the byte length. And only
return the allowed number of Log Page bytes as determined by the user
request and actual size of the requested log page.
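
A minimal sketch of the length calculation, assuming the caller has
already assembled the 0's based Number of Dwords (NUMD) value from the
command. The function name is hypothetical and the destination buffer
is assumed to be large enough for the returned length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy at most the requested number of bytes of a log page.
 * Returns the number of bytes copied.
 */
size_t
copy_log_page(void *dst, const void *log, size_t log_size, uint32_t numd)
{
	uint64_t req_bytes;
	size_t len;

	assert(dst != NULL && log != NULL);
	/* NUMD is a 0's based count of dwords (4 bytes), not words. */
	req_bytes = ((uint64_t)numd + 1) * sizeof(uint32_t);
	/* Never return more than the log page actually holds. */
	len = req_bytes < log_size ? (size_t)req_bytes : log_size;
	memcpy(dst, log, len);
	return (len);
}
```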

Fixes UNH Test 1.3

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24885


# f8fa7467 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: implement NVMe Namespace Identification Descriptor

NVMe 1.3 compliant controllers must implement the Namespace
Identification Descriptor structure (i.e. CNS=3). Previously this was
unimplemented.

Fixes UNH Test 1.1.4-0

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24884


# 064ca48f 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: Consolidate NVMe CQ update

Consolidate the code which writes Completion Queue entries and updates
the CQ doorbell value. While in the neighborhood, convert the "toggle CQ
phase bit" code to use an XOR operation instead of an "if/else" branch.

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24882


# d7e180fe 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: add locks around NVMe queue accesses

The NVMe code attempted to ensure thread safety through a combination of
using atomics and a "busy" flag. But this approach leads to unavoidable
race conditions.

Fix is to use per-queue mutex locks to ensure thread safety within the
queue processing code. While in the neighborhood, move all the queue
initialization code to a common function.

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D19841


# cf20131a 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: add a comment explaining NVME dsm option

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24881


# 9963f180 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: implement NVMe Flush command

This adds support for the NVMe I/O command Flush. For block-based
devices, submit a DIOCGFLUSH to the backing storage. Otherwise, command
is treated like a NOP and completes with a Successful status.

Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24880


# a43ab8d2 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: refactor NVMe IO command handling

This refactors the NVMe I/O command processing function to make adding
new commands easier. The main change is to move command specific
processing (i.e. Read/Write) to separate functions for each NVMe I/O
command and leave the common per-command processing in the existing
pci_nvme_handle_io_cmd() function.

While here, add checks for some common errors (invalid Namespace ID,
invalid opcode, LBA out of range).

Add myself to the Copyright holders

Reviewed by: imp
Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24879


# 0220a2ae 28-Jun-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: convert NVMe logging statements

Convert the debug and warning logging macros to be parameterized and
correctly use bhyve's PRINTLN macro.

Reviewed by: imp
Tested by: Jason Tubnor
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24878


# 1264a2b9 27-Mar-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: fix NVMe emulation update of SQHD

The SQHD field of a Completion Queue entry indicates the current
Submission Queue head pointer value. The head pointer represents the
next entry to be consumed and is updated after consuming the current
entry.

In the Admin queue processing, the current code updates the head pointer
after reporting the value to the host via the SQHD. This gives the
impression that the Controller is perpetually one command behind in its
processing of the Admin SQ. And while this doesn't appear to bother some
initiators, it is wrong.

Fix is to update the SQ head pointer prior to writing the SQHD value in
the completion.

While here, fix missed update of dword 0 (cdw0) in the completion
message.

Reported by: khng300
Reviewed by: jhb, imp
Approved by: jhb (maintainer)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24083


# 961be12f 27-Mar-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: fix NVMe emulation missed interrupts

The bhyve NVMe emulation has a race in the logic which generates command
completion interrupts. On FreeBSD guests, this manifests as kernel log
messages similar to:
nvme0: Missing interrupt

The NVMe emulation code sets a per-submission queue "busy" flag while
processing the submission queue, and only generates an interrupt when
the submission queue is not busy.

Aside from being counter to the NVMe design (i.e. interrupt properties
are tied to the completion queue) and adding complexity (e.g. exceptions
to not generating an interrupt when "busy"), it causes a race condition
under the following conditions:
- guest OS has no outstanding interrupts
- guest OS submits a single NVMe IO command
- bhyve emulation processes the SQ and sets the "busy" flag
- bhyve emulation submits the asynchronous IO to the backing storage
- IO request to the backing storage completes before the SQ processing
loop exits and doesn't generate an interrupt because the SQ is "busy"
- bhyve emulation finishes processing the SQ and clears the "busy" flag

Fix is to remove the "busy" flag and generate an interrupt when the CQ
head and tail pointers do not match.

Reported by: khng300
Reviewed by: jhb, imp
Approved by: jhb (maintainer)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24082


# f3e46ff9 27-Mar-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: use STAILQ in NVMe emulation

Use the standard queue(3) macros instead of hand-crafted linked list
code.

Reviewed by: imp, jhb
Approved by: jhb (maintainer)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24081


# cd65e089 27-Mar-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: implement NVMe deallocate command

This adds support for the Dataset Management (DSM) command to the NVMe
emulation in general, and more specifically, for the deallocate
attribute (a.k.a. trim in the ATA protocol). If the backing storage for
the namespace supports delete (i.e. deallocate), setting the deallocate
attribute in a DSM will trim/delete the requested LBA ranges in the
underlying storage.

Reviewed by: jhb, araujo, imp
Approved by: jhb (maintainer)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D21839


# d31d525e 27-Mar-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: refactor NVMe namespace initialization

Pass the struct pci_nvme_blockstore pointer for this namespace to the
namespace initialization function instead of only the desired eui64
value.

Minor functional change in that the code updates the eui64 value in the
blockstore.

Reviewed by: jhb, araujo
Approved by: jhb (maintainer)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D21838


# da8de3e9 27-Mar-2020 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: refactor NVMe PRP memcpy

Add a "copy direction" parameter to nvme_prp_memcpy such that data can
be copied to the memory specified by the PRP entries (current behavior)
or copied from the PRP entries (new behavior). The upcoming deallocate
functionality will use the copy from capability.

Reviewed by: jhb, araujo
Approved by: jhb (maintainer)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D21837


# 332eff95 08-Jan-2020 Vincenzo Maffione <vmaffione@FreeBSD.org>

bhyve: add wrapper for debug printf statements

Add printf() wrapper to use CR/CRLF terminators depending on whether
stdio is mapped to a tty open in raw mode.
Try to use the wrapper everywhere.
For now we leave the custom DPRINTF/WPRINTF defined by device
models, but we may remove them in the future.

Reviewed by: grehan, jhb
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D22657


# 79c1428e 02-Dec-2019 Vincenzo Maffione <vmaffione@FreeBSD.org>

bhyve: uniform printf format string newlines

Some of the printf statements only use LF to get a newline. However, a CR character is also required for the serial console to print debug logs in a nice way.
Fix those code locations that only use LF, by adding a CR character.

Reviewed by: markj, aleksandr.fedorov@itglobal.com
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D22552


# 31b67520 16-Jul-2019 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: update the NVMe CQ based on the status

Instead of skipping the NVMe Completion Queue update based on the
opcode, define a synthetic status value which indicates the completion
queue entry is invalid. This will also allow deferred completion queue
updates for other commands.

Also returns the correct status for unrecognized opcodes ("invalid
opcode").

Reviewed by: imp, jhb, araujo
Approved by: imp (mentor), jhb (maintainer)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20945


# 409a80e5 12-Jul-2019 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: Create EUI64 for NVMe namespaces

Accept an IEEE Extended Unique Identifier (EUI-64) from the command
line for each NVMe namespace. If one isn't provided, it will create one
based on the CRC16 of:
- the FreeBSD IEEE OUI
- PCI bus, device/slot, function values
- Namespace ID

Reviewed by: imp, araujo, jhb, rgrimes
Approved by: imp (mentor), jhb (maintainer)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D19905


# e47c1922 11-Jul-2019 Sean Chittenden <seanc@FreeBSD.org>

usr.sbin/bhyve: unconditionally initialize the NVMe completion status

Follow-up work to improve the handling of unsupported/invalid opcodes
is being developed by chuck@.

Coverity CID: 1398928
Reviewed by: chuck
Approved by: araujo, imp
Differential Revision: https://reviews.freebsd.org/D20914


# 129f93c5 07-Jun-2019 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: Add PCIe Integrated Endpoint capability

The NVMe CAM driver reports the PCIe Link Capability and Status for
devices. For emulated bhyve NVMe devices, this looks like:

nda0: nvme version 1.3 x63 (max x63) lanes PCIe Gen15 (max Gen15) link

The driver outputs this because the emulated device doesn't include the
PCIe Capability structure. The NVMe specification requires these
registers, so the fix is to add this set of capability registers to the
emulated device.

Note that PCI Express devices that are integrated into the Root Complex
(i.e. Bus 0x0) do not have to support the Link Capability or Status
registers. Windows will fail to start (i.e. Code 10) devices that appear
to be part of the Root Complex but report being a PCI Express Endpoint.
So also add a check to pci_emul_add_pciecap() to check if the device is
integrated and change the device type.

Reviewed by: imp, ken, araujo, jhb, rgrimes
Approved by: imp (mentor), ken (mentor), jhb (maintainer)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D19904


# a1daa3ae 05-Apr-2019 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: Fix NVMe data structure copy to guest

bhyve's NVMe emulation was transferring Identify data back to the guest
incorrectly causing memory corruptions. These corruptions resulted in
core dumps and other system level errors in the guest.

In their simplest form, NVMe Physical Region Page (PRP) values in
commands indicate which physical pages to use for data transfer. The
first PRP value is not required to be page aligned but does not cross a
page boundary. The second PRP value must be page aligned, does not cross
a page boundary, and need not be contiguous with PRP1.

The code was copying Identify data past the end of PRP1. This happens to
work if PRP1 and PRP2 are physically contiguous but will corrupt guest
memory in unpredictable ways if they are not.

Fix is to copy the Identify data back to the guest piecewise (i.e. for
each PRP entry). Also fix a similar problem when copying back
Log page data.
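
A piecewise copy of this kind can be sketched as follows. The sketch
makes several assumptions that are not from the commit: guest memory is
modeled as a flat byte array, the page size is 4096 bytes, only the
two-entry case (PRP1 plus PRP2 as a data pointer, no PRP list) is
handled, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SZ	4096u	/* assumed memory page size */

/*
 * Copy len bytes from src into guest memory described by two PRP
 * entries. Returns 0 on success, -1 if the transfer does not fit.
 */
int
prp_copy_to_guest(uint8_t *guest_mem, size_t mem_size, uint64_t prp1,
    uint64_t prp2, const uint8_t *src, size_t len)
{
	size_t first, rest;

	assert(guest_mem != NULL && src != NULL);
	/* Bytes available from prp1 up to the end of its page. */
	first = PAGE_SZ - (size_t)(prp1 % PAGE_SZ);
	if (first > len)
		first = len;
	rest = len - first;
	/* The remainder must fit in one page-aligned PRP2 page. */
	if (rest > PAGE_SZ || (rest != 0 && prp2 % PAGE_SZ != 0))
		return (-1);
	if (prp1 > mem_size || first > mem_size - prp1)
		return (-1);
	if (rest != 0 && (prp2 > mem_size || rest > mem_size - prp2))
		return (-1);

	/* Copy each piece separately; the pages need not be contiguous. */
	memcpy(guest_mem + prp1, src, first);
	if (rest != 0)
		memcpy(guest_mem + prp2, src + first, rest);
	return (0);
}
```

Copying past the end of the PRP1 page, as the old code did, would write
into whatever page happens to follow it physically.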

Reviewed by: imp (mentor), araujo, jhb, rgrimes, bhyve
Approved by: imp (mentor), bhyve (jhb)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D19695


# fe1b713e 05-Apr-2019 Chuck Tuffli <chuck@FreeBSD.org>

bhyve: Fix NVMe BAR size calculation

The NVMe specification defines bits 13:4 of BAR0 as Reserved (i.e. 0x0).
Most drivers do not enforce this, but the Windows NVMe driver does and
will refuse to start the device (i.e. error 10) if any of these bits are
set.

The current BAR size calculation tries to minimize the amount of memory
the device reserves by scaling the BAR size by the maximum number of
queues supported by the device. But unless the device supports a large
number of queue pairs (over 1536), it will reserve too little memory.

The fix is to allocate a minimum of 16K bytes for BAR0.
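
The size calculation can be sketched as below. It assumes the doorbell
registers start at offset 0x1000 with two 4 byte doorbells per queue
pair (a doorbell stride of 0) and that the result is rounded up to a
power of two; the macro and function names are hypothetical. Under
these assumptions 0x1000 + 1536 * 8 is exactly 16384, which is
consistent with the "over 1536" figure above.

```c
#include <assert.h>
#include <stdint.h>

#define NVME_DOORBELL_OFFSET	0x1000u		/* start of doorbell registers */
#define NVME_MIN_BAR0_SIZE	(16u * 1024u)	/* keeps BAR0 bits 13:4 zero */

/* Return a BAR0 size for the given number of queue pairs (incl. admin). */
uint64_t
nvme_bar0_size(uint32_t queue_pairs)
{
	uint64_t need, size;

	assert(queue_pairs >= 1);
	/* One SQ tail and one CQ head doorbell per queue pair. */
	need = NVME_DOORBELL_OFFSET +
	    (uint64_t)queue_pairs * 2 * sizeof(uint32_t);
	/* Start at the 16K minimum and double until everything fits. */
	size = NVME_MIN_BAR0_SIZE;
	while (size < need)
		size <<= 1;
	return (size);
}
```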

Tested on Windows Server 2016 and 2019

Reviewed by: imp (mentor), araujo, jhb, bhyve
Approved by: imp (mentor), bhyve (jhb)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D19676


# 7bb10738 14-Mar-2019 Chuck Tuffli <chuck@FreeBSD.org>

Fix bhyve's NVMe Identify Namespace data

The NVMe Identify Namespace data structure's Number of LBA Formats
(NLBAF) field is a 0's based value (i.e. 0x0 means 1). Since the
emulation only supports a single format, set NLBAF to 0x0, not 1.

Reviewed by: imp, araujo, rgrimes
Approved by: imp (mentor)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D19579


# fbac7e0b 04-Jan-2019 Chuck Tuffli <chuck@FreeBSD.org>

Fix bhyve's NVMe Completion Queue entry values

The function which processes Admin commands was not returning the
Command Specific value in Completion Queue Entry, Dword 0 (CDW0). This
affects commands such as Set Features, Number of Queues which returns
the number of queues supported by the device in CDW0. In this case, the
host will only create 1 queue pair (Number of Queues is zero based).
This also masked a bug in the queue counting logic.

Reviewed by: imp, araujo
Approved by: imp (mentor)
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D18703


# 76e47b94 04-Jan-2019 Chuck Tuffli <chuck@FreeBSD.org>

Fix bhyve's NVMe queue bookkeeping

Many size / length parameters in NVMe are "0's based", meaning, a value
of 0x0 represents 1, 0x1 represents 2, etc. While this leads to an
efficient encoding, it can lead to subtle bugs. With respect to queues,
these parameters include:
- Maximum number of queue entries
- Maximum number of queues
- Number of Completion Queues
- Number of Submission Queues

To be consistent, convert all 0's based values from the host to 1's
based values internally. Likewise, convert internal 1's based values to
0's based values when returned to the host. This fixes an off-by-one bug
when creating IO queues and simplifies some of the code. Note that this
bug is masked by another bug.
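
The conversion convention can be captured in a pair of helpers such as
the following. The names are hypothetical and the sketch assumes a
16-bit field, as used for queue sizes and queue counts.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a 0's based value from the host to a 1's based count. */
uint32_t
from_zero_based(uint16_t wire)
{
	return ((uint32_t)wire + 1);
}

/* Convert an internal 1's based count to a 0's based value. */
uint16_t
to_zero_based(uint32_t count)
{
	assert(count >= 1 && count <= 65536);
	return ((uint16_t)(count - 1));
}
```

Converting once at the boundary means comparisons such as "qid >
num_queues" inside the emulation operate on plain counts.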

While in the neighborhood,
- fix an erroneous queue ID check (checking CQ count when deleting SQ)
- check for queue ID of 0x0 in a few places where this is illegal
- clean up the Set Features, Number of Queues command and check for
illegal values

Reviewed by: imp, araujo
Approved by: imp (mentor)
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D18702


# 0f6f91a8 06-Nov-2018 Marcelo Araujo <araujo@FreeBSD.org>

Cosmetic change to try to inline the memset with SSE/AVX instructions.
Also switch from int to size_t to keep portability.

Reviewed by: brooks
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D17795


# 9544e6dc 21-Aug-2018 Chuck Tuffli <chuck@FreeBSD.org>

Make NVMe compatible with the original API

The original NVMe API used bit-fields to represent fields in data
structures defined by the specification (e.g. the op-code in the command
data structure). The implementation targeted x86_64 processors and
defined the bit fields for little endian dwords (i.e. 32 bits).

This approach does not work as-is for big endian architectures and was
changed to use a combination of bit shifts and masks to support PowerPC.
Unfortunately, this changed the NVMe API and forces #ifdef's based on
the OS revision level in user space code.

This change reverts to something that looks like the original API, but
it uses bytes instead of bit-fields inside the packed command structure.
As a bonus, this works as-is for both big and little endian CPU
architectures.

Bump __FreeBSD_version to 1200081 due to API change

Reviewed by: imp, kbowling, smh, mav
Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D16404


# 1465a1e1 21-Aug-2018 Marcelo Araujo <araujo@FreeBSD.org>

Fix resource leak when using strdup(3).

Reported by: Coverity
CID: 1394929
Sponsored by: iXsystems Inc.


# 6b2c20cd 19-Aug-2018 Marcelo Araujo <araujo@FreeBSD.org>

NVMe spec version 1.3c says that "serial number" field must be 7-bit ASCII,
with unused bytes padded by space characters. Same for firmware number and
namespace number.
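
Space padding of a fixed-width field can be sketched as below. The
function name is hypothetical, the result is deliberately not NUL
terminated, and the sketch does not check that the input is 7-bit
ASCII.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Fill a fixed-width field with src, truncating if needed and padding
 * the unused bytes with spaces. No NUL terminator is written.
 */
void
pad_ascii_field(char *dst, size_t dstlen, const char *src)
{
	size_t n;

	assert(dst != NULL && src != NULL);
	n = strlen(src);
	if (n > dstlen)
		n = dstlen;
	memset(dst, ' ', dstlen);
	memcpy(dst, src, n);
}
```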

Discussed with: imp@
Sponsored by: iXsystems Inc.


# b018ea01 19-Aug-2018 Marcelo Araujo <araujo@FreeBSD.org>

Users must set the number of queues from 1 to maximum 16 queues.

Sponsored by: iXsystems Inc.


# df90fce2 19-Aug-2018 Marcelo Araujo <araujo@FreeBSD.org>

Fix double mutex lock.

Reported by: Coverity
CID: 1394833
Discussed with: Leon Dang
Sponsored by: iXsystems Inc.


# ec89307f 16-Aug-2018 Marcelo Araujo <araujo@FreeBSD.org>

Fix a resource leak when using strdup(3) and also fix few style(9).

Reported by: Coverity
CID: 1394929
MFC after: 1 week
Sponsored by: iXsystems Inc.


# 3955e1c0 16-Aug-2018 Marcelo Araujo <araujo@FreeBSD.org>

Remove duplicated code.

Reported by: Coverity
CID: 1394893
MFC after: 1 week
Sponsored by: iXsystems Inc.


# 9e59a2e8 16-Aug-2018 Marcelo Araujo <araujo@FreeBSD.org>

Add a comment explaining how the PSN works and why there is no need for
a null terminator. Also mark CID 1394825 as intentional.

Reported by: Coverity
CID: 1394825
MFC after: 1 week
Sponsored by: iXsystems Inc.


# e30993c2 16-Aug-2018 Marcelo Araujo <araujo@FreeBSD.org>

Increase the mask from 15 to 255 or otherwise NVME_FEAT_SOFTWARE_PROGRESS
will never be reached.

Discussed with: Leon Dang and Darius Mihai <dariusmihaim@gmail.com>
MFC after: 1 week.
Sponsored by: iXsystems Inc.


# c066c68c 04-Jul-2018 Marcelo Araujo <araujo@FreeBSD.org>

- Add bhyve NVMe device emulation.

The initial work on bhyve NVMe device emulation was done by the GSoC student
Shunsuke Mie and was heavily modified in performance, functionality and
guest support by Leon Dang.

bhyve:
-s <n>,nvme,devpath,maxq=#,qsz=#,ioslots=#,sectsz=#,ser=A-Z

accepted devpath:
/dev/blockdev
/path/to/image
ram=size_in_MiB

Tested with guest OS: FreeBSD Head, Linux Fedora fc27, Ubuntu 18.04,
OpenSuse 15.0, Windows Server 2016 Datacenter.
Tested with all accepted device paths: Real nvme, zdev and also with ram.
Tested on: AMD Ryzen Threadripper 1950X 16-Core Processor and
Intel(R) Xeon(R) CPU E5-2609 v2 @ 2.50GHz.

Tests at: https://people.freebsd.org/~araujo/bhyve_nvme/nvme.txt

Submitted by: Shunsuke Mie <sux2mfgj_gmail.com>,
Leon Dang <leon_digitalmsx.com>
Reviewed by: chuck (early version), grehan
Relnotes: Yes
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D14022