#
e84a75f9 |
|
07-Apr-2024 |
Warner Losh <imp@FreeBSD.org> |
nvme: Add telemetry page definitions Add definitions for page types 7 and 8 for host initiated telemetry and controller initiated telemetry (they differ by one byte, but the byte that's defined in the host version is reserved in the controller version). Sponsored by: Netflix
|
#
ebcfab99 |
|
08-May-2024 |
John Baldwin <jhb@FreeBSD.org> |
nvme: Explicitly align struct nvme_command on an 8 byte boundary This was already true for most architectures due to uint64_t structure members. However, i386 is special in that it only requires 4 byte alignment for uint64_t. As a result, casts from struct nvme_command to struct nvmf_fabric_cmd were raising a "cast increases alignment" warning on i386. Explicitly aligning struct nvme_command pacifies this warning on i386. Reported by: rscheff Sponsored by: Chelsio Communications
|
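A minimal sketch of the alignment fix described in the entry above, assuming a hypothetical 64-byte `struct example_cmd` rather than the real `struct nvme_command`. On an ABI where `uint64_t` only needs 4-byte alignment (such as i386), the explicit `alignas(8)` is what raises the structure's alignment to 8; on most other ABIs it is already 8 and the annotation changes nothing.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-byte command layout, for illustration only. */
struct example_cmd {
	alignas(8) uint32_t cdw0;	/* forces 8-byte alignment of the whole struct */
	uint32_t nsid;
	uint64_t payload[7];
};

/* Compile-time checks: these hold even where uint64_t is 4-byte aligned. */
static_assert(alignof(struct example_cmd) == 8, "example_cmd must be 8-byte aligned");
static_assert(sizeof(struct example_cmd) == 64, "example_cmd must be 64 bytes");

/* Report the alignment the compiler actually chose. */
size_t
example_cmd_alignment(void)
{
	return (alignof(struct example_cmd));
}
```

With the alignment stated explicitly, a cast to another 8-byte-aligned structure type no longer increases the required alignment, which is the situation the warning was about.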
#
29d7e39f |
|
07-May-2024 |
John Baldwin <jhb@FreeBSD.org> |
nvme: Bump the alignment of struct nvme_health_information_page to 8 This ensures that embedded uint64_t values used for statistics counters are aligned when allocating a structure on the stack or as part of a containing structure. In particular this quiets -Waddress-of-packed-member warnings from GCC when compiling the code in nvmfd to update the stats. Reported by: GCC
|
#
5e3e4442 |
|
02-May-2024 |
John Baldwin <jhb@FreeBSD.org> |
nvme: Add constants for the Fused Operation (FUSE) field in commands Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D44845
|
#
d86edc18 |
|
02-May-2024 |
John Baldwin <jhb@FreeBSD.org> |
nvmf.h: New header defining ioctls for NVMe over Fabrics This defines structures, ioctl commands, and related constants used for both the Fabrics host and controller. Reviewed by: imp Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D44706
|
#
97b77de2 |
|
16-Apr-2024 |
Warner Losh <imp@FreeBSD.org> |
nvme: Eliminate intel_log_temp_stats_swapbytes We can't post an AER for this page, so there's no need to be able to swap it to host byte order. It's not one of the standard defined pages that can post via AER, and the vendor's public docs for this temperature page don't suggest it's possible to get over or under event changes. Since nvmecontrol no longer needs the swap routine, remove it since it's now unused. Sponsored by: Netflix Reviewed by: chuck Differential Revision: https://reviews.freebsd.org/D44659
|
#
0b8f21e8 |
|
03-Apr-2024 |
Warner Losh <imp@FreeBSD.org> |
nvme: Add LPA bits Add all the bits from the NVMe 2.0 base specification: CMD_EFFECTS to indicate the commands and effects log page is supported, TELEMETRY to indicate that the telemetry log pages and protocols are supported, PERSISTENT_EVENTS to indicate the persistent event log is supported, LOG_PAGES_PAGE to indicate that various log pages related to log page and command support are supported (L0, L5, L12, and L13), and DA4_TELEMETRY to indicate that the DA4 area is supported for telemetry data. Sponsored by: Netflix
|
#
21d3a84d |
|
22-Mar-2024 |
John Baldwin <jhb@FreeBSD.org> |
nvme: Add NVMe over Fabrics fields to nvme_controller_data Reviewed by: imp Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D44448
|
#
7fa8adb8 |
|
22-Mar-2024 |
John Baldwin <jhb@FreeBSD.org> |
nvme: Add constants for the Controller Attributes field in cdata Reviewed by: imp Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D44447
|
#
88ecf154 |
|
22-Mar-2024 |
John Baldwin <jhb@FreeBSD.org> |
nvme: Add constants and types for the discovery log page This is used in NVMe over Fabrics to enumerate a list of available controllers. Reviewed by: imp Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D44446
|
#
b354bb04 |
|
22-Mar-2024 |
John Baldwin <jhb@FreeBSD.org> |
nvme: Add constants for fields in AER completion dword 0 Reviewed by: imp Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D44445
|
#
cbda1886 |
|
22-Mar-2024 |
John Baldwin <jhb@FreeBSD.org> |
nvme: Add constants for the extended data for Get Log Page command flag nvme(4) doesn't check this flag, but Fabrics implementations may need to set this flag in the log page attributes cdata field. Reviewed by: imp Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D44444
|
#
b8cb8dd3 |
|
22-Mar-2024 |
John Baldwin <jhb@FreeBSD.org> |
nvme: Add constants for the PSDT field in cdw0 This is not used in nvme(4) but is used in NVMe over Fabrics transports which use SGLs to describe buffers instead of PRPs. While here, adjust the shift value for the FUSE field to be relative to the 'fuse' member of 'struct nvme_command'. Reviewed by: imp Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D44443
|
#
f21a54d1 |
|
22-Mar-2024 |
John Baldwin <jhb@FreeBSD.org> |
nvme: Add SGL structure and constants for use in NVMe commands Fabrics capsules use an SGL structure instead of prp1/2 addresses to describe the data buffer used for a command. The SGL structure is added to a union with the existing prp1/2 fields. Reviewed by: imp Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D44442
|
#
1931b75e |
|
22-Mar-2024 |
John Baldwin <jhb@FreeBSD.org> |
nvme: Export constants for min and max queue sizes These are useful for NVMe over Fabrics. Reviewed by: imp Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D44441
|
#
2a2682ee |
|
06-Mar-2024 |
Warner Losh <imp@FreeBSD.org> |
nvme: Add SMART WARNING for persistent memory region NVME 2.0 added persistent memory regions, and this bit reports critical warnings / errors with those regions. Sponsored by: Netflix Reviewed by: mav Differential Revision: https://reviews.freebsd.org/D44213
|
#
7485926e |
|
01-Mar-2024 |
John Baldwin <jhb@FreeBSD.org> |
nvme: Firmware revisions in the firmware slot info logpage are ASCII strings In particular, don't try to byteswap the values as 64-bit integers and always print a non-empty version as a string. Reviewed by: chuck, imp Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D44121
|
#
3a477a9b |
|
29-Jan-2024 |
John Baldwin <jhb@FreeBSD.org> |
nvme: Add NVMEF helper macro as the inverse of NVMEV This macro accepts a field name and a value for the field and constructs the shifted field value. Reviewed by: chuck Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D43604
|
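The three helper macros named in this and the following entries can be sketched as below. The macro bodies are inferred from the commit descriptions (NVMEM builds a field mask, NVMEV extracts a field value, NVMEF constructs a shifted field value) and may not match the header verbatim; `EXAMPLE_FIELD` is a hypothetical field invented for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Each helper pastes _SHIFT and _MASK onto the field name it is given. */
#define NVMEM(name)		(name##_MASK << name##_SHIFT)
#define NVMEV(name, value)	(((value) >> name##_SHIFT) & name##_MASK)
#define NVMEF(name, value)	(((value) & name##_MASK) << name##_SHIFT)

/* Hypothetical 4-bit field occupying bits 8..11 of a 32-bit word. */
#define EXAMPLE_FIELD_SHIFT	8
#define EXAMPLE_FIELD_MASK	0xFu

static_assert(NVMEM(EXAMPLE_FIELD) == 0xF00u, "mask covers bits 8..11");

/* Replace the field in reg with value, leaving all other bits untouched. */
uint32_t
example_set_field(uint32_t reg, uint32_t value)
{
	reg &= ~NVMEM(EXAMPLE_FIELD);
	reg |= NVMEF(EXAMPLE_FIELD, value);
	return (reg);
}

/* Extract the field from reg as an unshifted value. */
uint32_t
example_get_field(uint32_t reg)
{
	return (NVMEV(EXAMPLE_FIELD, reg));
}
```

In this sketch NVMEF masks the value before shifting, so out-of-range input cannot spill into neighbouring fields.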
#
1dade1f2 |
|
29-Jan-2024 |
John Baldwin <jhb@FreeBSD.org> |
nvme: Rename NVMEB helper macro to NVMEM The current macro always builds a full mask for a named field, so use the M suffix for mask. Reviewed by: chuck, imp Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D43601
|
#
479680f2 |
|
29-Jan-2024 |
John Baldwin <jhb@FreeBSD.org> |
nvme: Use the NVMEV macro instead of expanded versions Reviewed by: chuck Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D43595
|
#
b46c7b1e |
|
27-Dec-2023 |
Alexander Motin <mav@FreeBSD.org> |
nvme: Add some bits from NVMe 2.0c spec. MFC after: 1 week
|
#
95ee2897 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: two-line .h pattern Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
|
#
081c22db |
|
15-Aug-2023 |
John Baldwin <jhb@FreeBSD.org> |
nvme.h: Fix a comment typo in admin opcode enum Sponsored by: Chelsio Communications
|
#
ac8c866f |
|
07-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
nvme: Add more NVME Base Spec 2.0 and NVME Command Set Spec 1.0a Add admin commands capacity management, lockdown and fabrics commands. Add I/O copy command. Sponsored by: Netflix Reviewed by: chuck, mav, jhb Differential Revision: https://reviews.freebsd.org/D41311
|
#
5ae44634 |
|
27-Jun-2023 |
John Baldwin <jhb@FreeBSD.org> |
nvme: Fix typo in "Command Aborted by Host" constant name. Reviewed by: chuck, imp Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D40763
|
#
4d846d26 |
|
10-May-2023 |
Warner Losh <imp@FreeBSD.org> |
spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause. Discussed with: pfg MFC After: 3 days Sponsored by: Netflix
|
#
9a5acf36 |
|
19-Dec-2022 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
nvme: Clear the notify flag if the consumer rejects the controller. While here, fix some type mismatch warnings. Reviewed by: imp Sponsored by: Netapp, Inc. Sponsored by: Klara, Inc. MFC after: 1 week
|
#
8ab99dbe |
|
14-Nov-2022 |
Wanpeng Qian <wanpengqian@gmail.com> |
bhyve: abort and return FEATURE_NOT_SAVEABLE while set feature with a save flag for NVMe controller. Currently bhyve's NVMe controller cannot save feature values across reboots. It should return a FEATURE_NOT_SAVEABLE error when the command specifies a save flag. Quote from NVMe specification, page 205: https://nvmexpress.org/wp-content/uploads/NVM-Express-1_4-2019.06.10-Ratified.pdf If the Feature Identifier specified in the Set Features command is not saveable by the controller and the controller receives a Set Features command with the Save bit set to one, then the command shall be aborted with a status of Feature Identifier Not Saveable. Reviewed by: chuck (older version) Approved by: manu (mentor) MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D32767
|
#
a69c0964 |
|
05-Aug-2022 |
Alexander Motin <mav@FreeBSD.org> |
nvme: Print CRD, M and DNR status bits on errors. It may help with debugging some issues. MFC after: 1 week
|
#
3086efe8 |
|
15-Apr-2022 |
Warner Losh <imp@FreeBSD.org> |
nvme: Remove NVME_MAX_XFER_SIZE, replace inline calculation NVME_MAX_XFER_SIZE used to be a constant (back when MAXPHYS was a constant) to denote the smaller of MAXPHYS or the largest PRP we could encode with our preallocation scheme. However, it's no longer constant since MAXPHYS varies at runtime. In addition, the actual maximum is now based on the drive's currently in use page_size, which is also a runtime expression. As such, remove the define and expand it inline in the one place it's still used in the tree. Sponsored by: Netflix Reviewed by: chuck Differential Revision: https://reviews.freebsd.org/D34870
|
#
e66c1b51 |
|
15-Apr-2022 |
Warner Losh <imp@FreeBSD.org> |
nvme: Define NVME_MPS_SHIFT The memory page size (MPS) is expressed in terms of a 2^(number + 12) and other items in the system inherit this. Create a define rather than sprinkling 12 everywhere. Sponsored by: Netflix Reviewed by: chuck Differential Revision: https://reviews.freebsd.org/D34865
|
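The 2^(number + 12) encoding from the entry above can be sketched as follows. `NVME_MPS_SHIFT` is the name the commit introduces and its value of 12 comes from the commit text; the helper `example_mps_to_bytes` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NVME_MPS_SHIFT	12	/* value taken from the commit description */

static_assert((1u << NVME_MPS_SHIFT) == 4096, "an MPS of 0 means 4 KiB pages");

/*
 * Convert an encoded memory page size into bytes: 2^(mps + 12).
 * Hypothetical helper; the caller keeps mps small enough (below 20)
 * that the result fits in 32 bits.
 */
uint32_t
example_mps_to_bytes(uint32_t mps)
{
	return (UINT32_C(1) << (NVME_MPS_SHIFT + mps));
}
```

For example, an encoded value of 0 gives 4096 bytes and a value of 4 gives 65536 bytes.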
#
214df80a |
|
08-Apr-2022 |
Warner Losh <imp@FreeBSD.org> |
nvme: new define for size of host memory buffer sizes The nvme spec defines the various fields that specify sizes for host memory buffers in terms of 4096 chunks. So, rather than use a bare 4096 here, use NVME_HMB_UNITS. This is explicitly not the host page size of 4096, nor the default memory page size (mps) of the NVMe drive, but its own thing and needs its own define. No functional change is intended, only the logical spelling of 4k. Sponsored by: Netflix
|
#
c2318cf8 |
|
21-Feb-2022 |
Chuck Tuffli <chuck@FreeBSD.org> |
nvme: fix spelling of Namespace Fix spelling of a macro definition. Reviewed by: mav, imp Differential Revision: https://reviews.freebsd.org/D34330
|
#
e71afa12 |
|
21-Feb-2022 |
Chuck Tuffli <chuck@FreeBSD.org> |
nvme: Add OAES bit-field definitions Create definitions for the Optional Asynchronous Events Supported (OAES) values. Also adds a helper macro for the common use case of "mask and shift". E.g. value = NVME_CTRLR_DATA_OAES_NS_ATTR_MASK << NVME_CTRLR_DATA_OAES_NS_ATTR_SHIFT; becomes value = NVMEB(NVME_CTRLR_DATA_OAES_NS_ATTR); Reviewed by: mav, imp Differential Revision: https://reviews.freebsd.org/D34300
|
#
fea3cf1d |
|
02-Jul-2021 |
Warner Losh <imp@FreeBSD.org> |
nvme: Fix alignment on nvme structures Remove __packed from nvme_command, nvme_completion and nvme_dsm_trim. Add super-alignment to nvme_completion since it's always at least that aligned in hardware (and in our existing uses of it embedded in structures). It generates better code in nvme_qpair_process_completions on riscv64 because otherwise the ABI assumes a 4-byte alignment, and the same on all other platforms. Reviewed by: jrtc27@, mav@, chuck@ Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D31001
|
#
80a75155 |
|
02-Jul-2021 |
Warner Losh <imp@FreeBSD.org> |
nvme: style nit Put the { on the same line as the struct nvme_foo when we define these structures. It's FreeBSD standard and these were inconsistent. Sponsored by: Netflix
|
#
e83fdf8b |
|
08-Jan-2021 |
Chuck Tuffli <chuck@FreeBSD.org> |
fix big-endian platforms after 6733401935f8 The NVMe byte-swap routines for big-endian platforms used memcpy() to move the unaligned 64-bit value into a temp register to byte swap it. Instead of introducing a dependency, manually byte-swap the values in place. Point hat: me
|
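One way to byte-swap a possibly unaligned 64-bit value in place without calling `memcpy()`, as the entry above describes, is to reverse its bytes through a `uint8_t` pointer. This is only a sketch of the idea; the function name is hypothetical and the routine in the header may be written differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Reverse the 8 bytes at p in place. Accessing the value one byte at a
 * time makes no assumption about the alignment of p.
 */
void
example_swap64_in_place(uint8_t *p)
{
	assert(p != NULL);
	for (size_t i = 0; i < 4; i++) {
		uint8_t tmp = p[i];

		p[i] = p[7 - i];
		p[7 - i] = tmp;
	}
}
```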
#
67334019 |
|
08-Jan-2021 |
Chuck Tuffli <chuck@FreeBSD.org> |
nvmecontrol: add device self-test op and log page Add decoding of the Device Self-test log page and the ability to start or abort a test. Reviewed by: imp, mav Tested by: Muhammad Ahmad <muhammad.ahmad@seagate.com> MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D27517
|
#
8d08cdc7 |
|
02-Dec-2020 |
Chuck Tuffli <chuck@FreeBSD.org> |
nvme: Fix typo in definition Change occurrences of "selt test" to "self test" in the NVMe header file. Reviewed by: imp, mav MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D27439
|
#
cf7c0629 |
|
01-Dec-2020 |
Michal Meloun <mmel@FreeBSD.org> |
Always use the __unused attribute even for potentially unused parameters. Requested by: ian, imp MFC with: r368167
|
#
b2e9e573 |
|
30-Nov-2020 |
Michal Meloun <mmel@FreeBSD.org> |
Unbreak r368167 in userland. Decorate unused arguments. Reported by: kp, tuexen, jenkins, and many others MFC with: r368167
|
#
52a83207 |
|
30-Nov-2020 |
Michal Meloun <mmel@FreeBSD.org> |
NVME: Don't try to swap data on little endian machines. These swapping functions violate the BUSDMA contract - we cannot write to armed (by bus_dmamap_sync(PRE_..)) buffers. Remove them at least from little endian machines until a better solution is developed. Reviewed by: imp MFC after: 3 weeks
|
#
ac90f70d |
|
28-Nov-2020 |
Alexander Motin <mav@FreeBSD.org> |
Increase nvme(4) maximum transfer size from 1MB to 2MB. With 4KB page size the 2MB is the maximum we can address with one page PRP. Going further would require chaining, which would add some more complexity. On the other side, to reduce memory consumption, allocate the PRP memory respecting maximum transfer size reported in the controller identify data. Many NVMe devices support much smaller values, starting from 128KB. To do that we have to change the initialization sequence to pull the data earlier, before setting up the I/O queue pairs. The admin queue pair is still allocated for full MIN(maxphys, 2MB) size, but it is not a big deal, since there is only one such queue with only 16 trackers. Reviewed by: imp MFC after: 2 weeks Sponsored by: iXsystems, Inc.
|
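The 2MB figure in the entry above follows from simple arithmetic: a PRP entry is a 64-bit address, so one 4KB page holds 512 entries, and each entry points at one 4KB page of data. A rough sketch, with a hypothetical helper name and ignoring any page addressed directly by the command's first PRP field:

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(uint64_t) == 8, "a PRP entry is 8 bytes");

/*
 * Bytes addressable through a single page full of PRP entries, when
 * every entry points at one page of data.
 */
uint64_t
example_max_xfer_one_prp_page(uint64_t page_size)
{
	uint64_t entries = page_size / sizeof(uint64_t);

	return (entries * page_size);
}
```

With `page_size` set to 4096 this gives 512 * 4096 = 2097152 bytes, i.e. 2MB.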
#
cd853791 |
|
27-Nov-2020 |
Konstantin Belousov <kib@FreeBSD.org> |
Make MAXPHYS tunable. Bump MAXPHYS to 1M. Replace MAXPHYS by runtime variable maxphys. It is initialized from MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys. Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer cache buffers exactly to atop(maxbcachebuf) (currently it is sized to atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1. The +1 for pbufs allow several pbuf consumers, among them vmapbuf(), to use unaligned buffers still sized to maxphys, esp. when such buffers come from userspace (*). Overall, we save significant amount of otherwise wasted memory in b_pages[] for buffer cache buffers, while bumping MAXPHYS to desired high value. Eliminate all direct uses of the MAXPHYS constant in kernel and driver sources, except a place which initializes maxphys. Some random (and arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted straight. Some drivers, which use MAXPHYS to size embedded structures, get private MAXPHYS-like constant; their conversion is out of scope for this work. Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs, dev/siis, were either submitted by, or based on changes by mav. Suggested by: mav (*) Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions) Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D27225
|
#
0bed3eab |
|
13-Nov-2020 |
Alexander Motin <mav@FreeBSD.org> |
Add PMRCAP printing and fix earlier CAP_HI. MFC after: 3 days
|
#
6dd1985b |
|
28-Oct-2020 |
Alexander Motin <mav@FreeBSD.org> |
Fix unintentional constant rename in r367109. MFC after: 1 week
|
#
c44441f8 |
|
28-Oct-2020 |
Alexander Motin <mav@FreeBSD.org> |
Print NVMe controller capabilities in verbose dmesg. Those values are not reported in controller identification, while sometimes interesting for development and debugging. MFC after: 1 week
|
#
e32d47f3 |
|
21-Sep-2020 |
David Bright <dab@FreeBSD.org> |
Add an ioctl to get an NVMe device's maximum transfer size Reviewed by: imp, chuck Obtained from: Dell EMC Isilon MFC after: 1 week Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D26390
|
#
d87b31e1 |
|
01-Sep-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
nvme: clean up empty lines in .c and .h files
|
#
881534f0 |
|
31-Aug-2020 |
Warner Losh <imp@FreeBSD.org> |
Use symbolic names for async events Rather than |= 0x300, define and use async event names for the name space changes and the firmware activations that we're asking for.
|
#
67abaee9 |
|
07-Jan-2020 |
Alexander Motin <mav@FreeBSD.org> |
Add Host Memory Buffer support to nvme(4). This allows the cheapest DRAM-less NVMe SSDs to use some of host RAM (about 1MB per 1GB on the devices I have) for their metadata cache, significantly improving random I/O performance. Device reports minimal and preferable size of the buffer. The code limits it to 1% of physical RAM by default. If the buffer cannot be allocated or is below the minimal size, the device will just have to work without it. MFC after: 2 weeks Relnotes: yes Sponsored by: iXsystems, Inc.
|
#
70d20ed3 |
|
05-Aug-2019 |
Alexander Motin <mav@FreeBSD.org> |
Add `nvmecontrol resv` to handle NVMe reservations. NVMe reservations are quite alike to SCSI persistent reservations and can be used in clustered setups with shared multiport storage. MFC after: 10 days Relnotes: yes Sponsored by: iXsystems, Inc.
|
#
a6d222eb |
|
02-Aug-2019 |
Alexander Motin <mav@FreeBSD.org> |
Add more random bits from NVMe 1.4. MFC after: 2 weeks
|
#
6c99d132 |
|
02-Aug-2019 |
Alexander Motin <mav@FreeBSD.org> |
Decode few more NVMe log pages. In particular: Changed Namespace List, Commands Supported and Effects, Reservation Notification, Sanitize Status. Add few new arguments to `nvmecontrol log` subcommand. MFC after: 2 weeks Sponsored by: iXsystems, Inc.
|
#
8dafbebd |
|
01-Aug-2019 |
Alexander Motin <mav@FreeBSD.org> |
Fix typo in r350529. MFC after: 2 weeks
|
#
90dfa8f0 |
|
01-Aug-2019 |
Alexander Motin <mav@FreeBSD.org> |
Add more new fields and values from NVMe 1.4. MFC after: 2 weeks
|
#
a7bf63be |
|
01-Aug-2019 |
Alexander Motin <mav@FreeBSD.org> |
Add IOCTL to translate nvdX into nvmeY and NSID. While very useful by itself, it also makes `nvmecontrol` not depend on hardcoded device names parsing, that in its turn makes simple to take nvdX (and potentially any other) device names as arguments. Also added IOCTL bypass from nvdX to respective nvmeYnsZ makes them interchangeable for management purposes. MFC after: 2 weeks Sponsored by: iXsystems, Inc.
|
#
8de2d8c0 |
|
28-Jul-2019 |
Alexander Motin <mav@FreeBSD.org> |
Add some new fields and bits from NVMe 1.4. MFC after: 2 weeks Sponsored by: iXsystems, Inc.
|
#
62d2cf18 |
|
18-Jul-2019 |
Warner Losh <imp@FreeBSD.org> |
Provide macros to extract the sub-fields of the CAP_LO and CAP_HI registers. These macros make places where we extract these easier to read. The shift and mask stuff is also a bit tedious and error prone. Start with the CAP_LO and CAP_HI registers since their scope is somewhat constrained. This is style change only, no functional changes. Reviewed by: chuck Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D20979
|
#
1aed4995 |
|
05-May-2019 |
Alexander Motin <mav@FreeBSD.org> |
Decode Deallocate Logical Block Features. MFC after: 1 week
|
#
87b3975e |
|
13-Dec-2018 |
Chuck Tuffli <chuck@FreeBSD.org> |
nda(4) fix check for Dataset Management support In the nda(4) driver, only set DISKFLAG_CANDELETE (a.k.a. can support BIO_DELETE) if the drive supports Dataset Management. There are reports that without this check, VMWare Workstation does not work reliably. Fix is to check the ONCS field in the NVMe Controller Data structure for support. This check previously existed but did not survive the big-endian changes. Reported by: yuripv@yuripv.net Reviewed by: imp, mav, jimharris Approved by: imp (mentor) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D18493
|
#
9544e6dc |
|
21-Aug-2018 |
Chuck Tuffli <chuck@FreeBSD.org> |
Make NVMe compatible with the original API The original NVMe API used bit-fields to represent fields in data structures defined by the specification (e.g. the op-code in the command data structure). The implementation targeted x86_64 processors and defined the bit fields for little endian dwords (i.e. 32 bits). This approach does not work as-is for big endian architectures and was changed to use a combination of bit shifts and masks to support PowerPC. Unfortunately, this changed the NVMe API and forces #ifdef's based on the OS revision level in user space code. This change reverts to something that looks like the original API, but it uses bytes instead of bit-fields inside the packed command structure. As a bonus, this works as-is for both big and little endian CPU architectures. Bump __FreeBSD_version to 1200081 due to API change Reviewed by: imp, kbowling, smh, mav Approved by: imp (mentor) Differential Revision: https://reviews.freebsd.org/D16404
|
#
f439e3a4 |
|
24-May-2018 |
Alexander Motin <mav@FreeBSD.org> |
Refactor NVMe CAM integration. - Remove layering violation, when NVMe SIM code accessed CAM internal device structures to set pointers on controller and namespace data. Instead make NVMe XPT probe fetch the data directly from hardware. - Cleanup NVMe SIM code, fixing support for multiple namespaces per controller (reporting them as LUNs) and adding controller detach support and run-time namespace change notifications. - Add initial support for namespace change async events. So far only in CAM mode, but it allows run-time namespace arrival and departure. - Add missing nvme_notify_fail_consumers() call on controller detach. Together with previous changes this allows NVMe device detach/unplug. Non-CAM mode still requires a lot of love to stay on par, but at least CAM mode code should not stay in the way so much, becoming much more self-sufficient. Reviewed by: imp MFC after: 1 month Sponsored by: iXsystems, Inc.
|
#
afdbfe1e |
|
19-Mar-2018 |
Warner Losh <imp@FreeBSD.org> |
Starting LBA is a 64bit number, so use htole64 instead of htole32. The latter casts the LBA to a 32-bit number before assigning it to the 64 bit structure entity. This works fine on the first 2TB of TRIMs, but terribly beyond that due to truncation. Also, add an assert to make sure we don't send too many DSM TRIM entries in one request. Sponsored by: Netflix
|
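The truncation described in the entry above can be sketched without any byte-order helpers: narrowing the LBA to 32 bits keeps only its low half, so it is lossless only below 2^32 blocks. The 2TB boundary assumes 512-byte logical blocks, which the commit does not state explicitly, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 2^32 blocks of 512 bytes each is 2^41 bytes, i.e. 2 TiB. */
static_assert((UINT64_C(1) << 32) * 512 == (UINT64_C(2) << 40),
    "32-bit LBAs cover 2 TiB with 512-byte blocks");

/*
 * Mimic the old behaviour: the LBA is narrowed to 32 bits before being
 * stored in a 64-bit field, so the upper half is silently dropped.
 */
uint64_t
example_lba_via_32bit(uint64_t lba)
{
	return ((uint32_t)lba);
}
```

An LBA of 0xFFFFFFFF survives the round trip, while 0x100000000 (the first block past 2TB under that assumption) comes back as 0, so a TRIM aimed there would land at the start of the device instead.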
#
807e94b2 |
|
14-Mar-2018 |
Warner Losh <imp@FreeBSD.org> |
Implement trim collapsing in nda When multiple trims are in the queue, collapse them as much as possible. At present, this usually results in only a few trims being collapsed together, but more work on that will make it possible to do hundreds (up to some configurable max). Sponsored by: Netflix
|
#
01c1be35 |
|
12-Mar-2018 |
Alexander Motin <mav@FreeBSD.org> |
Print fuses and fna fields in identify data. MFC after: 2 weeks Sponsored by: iXsystems, Inc.
|
#
6b1a96b1 |
|
10-Mar-2018 |
Alexander Motin <mav@FreeBSD.org> |
Add new opcodes and statuses from NVMe 1.3a. MFC after: 2 weeks Sponsored by: iXsystems, Inc.
|
#
3fa5467a |
|
10-Mar-2018 |
Alexander Motin <mav@FreeBSD.org> |
Add new identify data structures fields from NVMe 1.3a. Some of them are already supported by existing hardware, so reporting them `nvmecontrol identify` can be useful.
|
#
afdc2600 |
|
22-Feb-2018 |
Kyle Evans <kevans@FreeBSD.org> |
nvme: Unbreak LE builds after r329824 The parameter 'p' is unused if _BYTE_ORDER == _LITTLE_ENDIAN. Add in a (void)p to fix the build.
|
#
0d787e9b |
|
22-Feb-2018 |
Wojciech Macek <wma@FreeBSD.org> |
NVMe: Add big-endian support Remove bitfields from defined structures as they are not portable. Instead use shift and mask macros in the driver and nvmecontrol application. NVMe is now working on powerpc64 host. Submitted by: Michal Stanek <mst@semihalf.com> Obtained from: Semihalf Reviewed by: imp, wma Sponsored by: IBM, QCM Technologies Differential revision: https://reviews.freebsd.org/D13916
|
#
0028abe6 |
|
22-Feb-2018 |
Warner Losh <imp@FreeBSD.org> |
Backout r329818, r329816 and r329815. These aren't the commits I thought I was testing prior to commit. Revert until I can sort out what happened and fix it.
|
#
4d87e271 |
|
21-Feb-2018 |
Warner Losh <imp@FreeBSD.org> |
Combine BIO_DELETE requests for nda devices Now that we're queueing BIO_DELETE requests in the CAM I/O scheduler, it makes sense to try to combine as many as possible into a single request to send down to hardware. Hopefully, lots of larger requests like this are better than lots of individual transactions. Note for future: need to limit based on total size of the trim request. Should also collapse adjacent ranges where possible to increase the size of the max payload. Sponsored by: Netflix
|
#
718cf2cc |
|
27-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys/dev: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.
|
#
4e3b2744 |
|
13-Nov-2017 |
Warner Losh <imp@FreeBSD.org> |
Provide link speed data in XPT_GET_TRAN_SETTINGS. Provide full version information for that and XPT_PATH_INQ. Provide macros to encode/decode major/minor versions. Read the link speed and lane count to compute the base_transfer_speed for XPT_PATH_INQ. Sponsored by: Netflix
|
#
fa271a5d |
|
15-Oct-2017 |
Warner Losh <imp@FreeBSD.org> |
Closer examination shows that nvme and CAM both normally zero-fill allocations (for req and ccb, which ultimately contain the nvme_cmd). As such, we can micro-optimize these routines. Add a comment to this effect, and bzero the ccb used to make the requests for the nda dump routine so it more closely matches a ccb allocated with xpt_get_ccb(). Sponsored by: Netflix
|
#
fbed8df2 |
|
15-Oct-2017 |
Warner Losh <imp@FreeBSD.org> |
Explicitly set reserved fields and 'fuse' to 0. This prevents us from accidentally sending bogus values in these fields, which some drives may reject with an error or worse (undefined behavior). This is especially needed for the ndadump routine which allocates the cmd from stack garbage.... Sponsored by: Netflix
|
#
c2005bba |
|
29-Aug-2017 |
Warner Losh <imp@FreeBSD.org> |
Fix a few overlooked spots where the code uses 16-bit NSIDs. Chuck Tuffli had submitted a more thorough patch that I was unaware of when I did my work and this brings in the bits I missed from that patch. PR: 220267 Submitted by: Chuck Tuffli
|
#
030edcce |
|
25-Aug-2017 |
Warner Losh <imp@FreeBSD.org> |
Fill in reserved areas from NVMe spec in the IDENTIFY structure (struct nvme_controller_data) as defined in the NVM Express specification, revision 1.3. Sponsored by: Netflix
|
#
223a9b93 |
|
25-Aug-2017 |
Warner Losh <imp@FreeBSD.org> |
Add feature codes from NVMe 1.3 specification: o Autonomous Power State Transition o Host Memory Buffer o Timestamp o Keep Alive Timer o Host Controlled Thermal Management o Non-Operational Power State Config Also note that feature codes 0x78-0x7f are reserved for the NVMe Management Interface. Sponsored by: Netflix
|
#
0012e436 |
|
24-Aug-2017 |
Warner Losh <imp@FreeBSD.org> |
Use _Static_assert These files are compiled in userland too, so we can't use sys/systm.h and rely on CTASSERT. Switch to using _Static_assert instead. MFC After: 3 days Sponsored by: Netflix
|
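A compile-time size check of the kind these two entries describe might look like the sketch below. `struct example_log_entry` and its 16-byte size are made up for the illustration, and `__attribute__((packed))` is the GCC/Clang spelling, used here as an assumption because the FreeBSD `__packed` macro is not available in a standalone file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packed on-the-wire structure, for illustration only. */
struct example_log_entry {
	uint8_t		type;
	uint8_t		reserved[3];
	uint32_t	count;
	uint64_t	value;
} __attribute__((packed));

/*
 * _Static_assert is a C11 keyword, so the check works in both kernel and
 * userland builds without any kernel-only header.
 */
_Static_assert(sizeof(struct example_log_entry) == 16,
    "bad size for example_log_entry");

/* Zero an entry before use so reserved bytes never carry garbage. */
void
example_log_entry_init(struct example_log_entry *e)
{
	assert(e != NULL);
	memset(e, 0, sizeof(*e));
}
```

If a later edit changes the layout, the build fails at the `_Static_assert` line instead of producing a structure that silently disagrees with the specification.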
#
0c26c199 |
|
24-Aug-2017 |
Warner Losh <imp@FreeBSD.org> |
Sanity check sizes Add compile time sanity checks to make sure that packed structures are the proper size, typically as defined in the NVMe standard.
|
#
8a5d94f9 |
|
03-Aug-2017 |
Warner Losh <imp@FreeBSD.org> |
Make nvd vs nda choice boot-time rather than build-time Introduce hw.nvme.use_nvd tunable. This tunable allows both nvd and nda to be installed in the kernel, while allowing only one of them to create devices. This is an all-or-nothing setting, and you can't change it after boot-time. However, it will allow easier A/B testing. Differential Revision: https://reviews.freebsd.org/D11825
|
#
594ffc03 |
|
27-Jun-2017 |
Warner Losh <imp@FreeBSD.org> |
Add new definitions for namespaces. Sponsored by: Netflix Submitted by: Matt Williams (via D11330)
|
#
05ee702a |
|
07-Mar-2017 |
Warner Losh <imp@FreeBSD.org> |
cdw10 takes the low 32-bits and cdw11 takes the upper 32-bits of the lba. Rather than do a cast to uint64_t, which clang warns might be unaligned, do the stores 32-bits at a time. Sponsored by: Netflix
|
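The 32-bits-at-a-time store from the entry above can be sketched as follows. The two-member structure is a hypothetical stand-in for the relevant part of the command, and the conversion to little-endian that a real driver would apply to each dword is left out to keep the example portable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two command dwords that carry the LBA. */
struct example_cmd_dwords {
	uint32_t cdw10;		/* low 32 bits of the starting LBA */
	uint32_t cdw11;		/* upper 32 bits of the starting LBA */
};

/*
 * Store a 64-bit LBA as two separate 32-bit writes, avoiding a cast of
 * &cdw10 to uint64_t * and the alignment assumption that comes with it.
 */
void
example_set_lba(struct example_cmd_dwords *cmd, uint64_t lba)
{
	assert(cmd != NULL);
	cmd->cdw10 = (uint32_t)(lba & 0xffffffffu);
	cmd->cdw11 = (uint32_t)(lba >> 32);
}
```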
#
0cf14228 |
|
19-Nov-2016 |
Warner Losh <imp@FreeBSD.org> |
Implement HGST Log page 0xc1, as documented in the HGST SN100 and SN150 product manuals. Subpage 0x32 is documented, but not implemented. Sponsored by: Netflix, Inc
|
#
ab1dd091 |
|
19-Nov-2016 |
Warner Losh <imp@FreeBSD.org> |
Print Intel's expanded Temperature log page. Sponsored by: Netflix, Inc
|
#
d01f26f5 |
|
19-Nov-2016 |
Warner Losh <imp@FreeBSD.org> |
Add log pages that Intel SSDs provide. It turns out that many of these are widely implemented beyond just Intel drives. Sponsored by: Netflix, Inc
|
#
aea52879 |
|
19-Nov-2016 |
Warner Losh <imp@FreeBSD.org> |
Add log pages defined through NVM Express 1.2.1. Sponsored by: Netflix, Inc
|
#
dc58cdf9 |
|
19-Nov-2016 |
Warner Losh <imp@FreeBSD.org> |
Expand the SMART / Health Information Log Page (Page 02) printout based on NVM Express 1.2.1 Standard. Sponsored by: Netflix, Inc
|
#
a498975e |
|
18-Jul-2016 |
Scott Long <scottl@FreeBSD.org> |
Implement crashdump support on NVME MFC after: 3 days Sponsored by: Netflix, Inc.
|
#
f24c011b |
|
10-Jun-2016 |
Warner Losh <imp@FreeBSD.org> |
Commit the bits of nda that were missed. This should fix the build. Approved by: re@
|
#
ee7f4d81 |
|
10-Mar-2016 |
Alexander Motin <mav@FreeBSD.org> |
Revert r292074 (by smh): Limit stripesize reported from nvd(4) to 4K I believe that this patch handled the problem from the wrong side. Instead of making ZFS properly handle large stripe sizes, it made an unrelated driver lie in its reported parameters to work around that. An alternative solution for this problem from the ZFS side was committed at r296615. Discussed with: smh
|
#
038659e7 |
|
30-Jan-2016 |
Warner Losh <imp@FreeBSD.org> |
Implement power command to list all power modes, find out the power mode we're in and to set the power mode.
|
#
fdf16a68 |
|
10-Dec-2015 |
Steven Hartland <smh@FreeBSD.org> |
Limit stripesize reported from nvd(4) to 4K Intel NVMe controllers have a slow path for I/Os that span a 128KB stripe boundary but ZFS limits ashift, which is derived from d_stripesize, to 13 (8KB) so we limit the stripesize reported to geom(8) to 4KB. This may result in a small number of additional I/Os to require splitting in nvme(4), however the NVMe I/O path is very efficient so these additional I/Os will cause very minimal (if any) difference in performance or CPU utilisation. This can be controlled by the new sysctl kern.nvme.max_optimal_sectorsize. MFC after: 1 week Sponsored by: Multiplay Differential Revision: https://reviews.freebsd.org/D4446
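To make the "spans a stripe boundary" condition concrete, here is a minimal sketch of how such a check could look. The function name and parameters are hypothetical; the log does not show how nvme(4) actually detects or splits these I/Os.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when the byte range
 * [offset, offset + length) touches more than one stripe.
 * Assumes length > 0 and stripe_size > 0.
 */
static bool
example_spans_stripe(uint64_t offset, uint64_t length, uint64_t stripe_size)
{
	uint64_t first_stripe = offset / stripe_size;
	uint64_t last_stripe = (offset + length - 1) / stripe_size;

	return (first_stripe != last_stripe);
}
```

With a 128KB stripe, an 8KB I/O starting at offset 124KB crosses the boundary and would take the slow path described above, while a 128KB I/O starting at offset 0 does not.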
|
#
fdbd3d80 |
|
30-Oct-2015 |
Jim Harris <jimharris@FreeBSD.org> |
nvd, nvme: report stripesize through GEOM disk layer MFC after: 3 days Sponsored by: Intel
|
#
992db80f |
|
08-Oct-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Extend some 32-bit fields and variables to 64-bit to prevent overflow when calculating stats in nvmecontrol perftest. Sponsored by: Intel Reported by: Joe Golio <joseph.golio@emc.com> Reviewed by: carl Approved by: re (hrs) MFC after: 1 week
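The kind of overflow this commit guards against can be illustrated generically. The log does not say which fields were widened, so the names and the byte-count calculation below are assumptions chosen only to show the effect.

```c
#include <stdint.h>

/* 32-bit product: wraps modulo 2^32 once the total exceeds 4 GiB. */
static uint32_t
example_total_bytes_32(uint32_t io_completed, uint32_t io_size)
{
	return (io_completed * io_size);
}

/* Widen one operand first so the multiplication is done in 64 bits. */
static uint64_t
example_total_bytes_64(uint32_t io_completed, uint32_t io_size)
{
	return ((uint64_t)io_completed * io_size);
}
```

For 2,000,000 completed I/Os of 4096 bytes each, the 64-bit version yields 8,192,000,000 bytes, while the 32-bit version wraps to 3,897,032,704.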
|
#
a40e72a6 |
|
08-Oct-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Add driver-assisted striping for upcoming Intel NVMe controllers that can benefit from it. Sponsored by: Intel Reviewed by: kib (earlier version), carl Approved by: re (hrs) MFC after: 1 week
|
#
56183abc |
|
13-Aug-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Send a shutdown notification in the driver unload path, to ensure notification gets sent in cases where system shuts down with driver unloaded. Sponsored by: Intel Reviewed by: carl MFC after: 3 days
|
#
38441bd9 |
|
19-Jul-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Add message when nvd disks are attached and detached. As part of this commit, add an nvme_strvis() function which borrows heavily from cam_strvis(). This will allow stripping of leading/trailing whitespace and also handle unprintable characters in model/serial numbers. This function goes into a new nvme_util.c file which is used by both the driver and nvmecontrol. Sponsored by: Intel Reviewed by: carl MFC after: 3 days
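A simplified sketch of the cleanup this commit describes is shown below. It is not the real nvme_strvis() (whose signature and handling of unprintable characters are not given in this log); `example_strvis` is a hypothetical helper that trims padding and replaces unprintable bytes with `?`.

```c
#include <stddef.h>

/*
 * Hypothetical helper: copy a fixed-width, space-padded identify field
 * (which need not be NUL-terminated) into a NUL-terminated string.
 */
static void
example_strvis(char *dst, size_t dstlen, const char *src, size_t srclen)
{
	size_t out = 0;

	if (dstlen == 0)
		return;

	/* Skip leading spaces. */
	while (srclen > 0 && src[0] == ' ') {
		src++;
		srclen--;
	}
	/* Drop trailing spaces and NUL padding. */
	while (srclen > 0 &&
	    (src[srclen - 1] == ' ' || src[srclen - 1] == '\0'))
		srclen--;

	/* Copy what is left, masking bytes outside printable ASCII. */
	while (srclen > 0 && out + 1 < dstlen) {
		unsigned char c = (unsigned char)*src++;

		srclen--;
		dst[out++] = (c < 0x20 || c >= 0x7f) ? '?' : (char)c;
	}
	dst[out] = '\0';
}
```

Passing the field length explicitly matters because model and serial number fields are fixed-width and padded rather than terminated, so `strlen()` on them would read past the field.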
|
#
e8f25c62 |
|
17-Jul-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Define constants for the lengths of the serial number, model number and firmware revision in the controller's identify structure. Also modify consumers of these fields to ensure they only use the specified number of bytes for their respective fields. Sponsored by: Intel Reviewed by: carl MFC after: 3 days
|
#
66619178 |
|
11-Jul-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Fix a poorly worded comment in nvme(4). MFC after: 3 days
|
#
e9efbc13 |
|
09-Jul-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Update copyright dates. MFC after: 3 days
|
#
49fac610 |
|
26-Jun-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Add firmware replacement and activation support to nvmecontrol(8) through a new firmware command. NVMe controllers may support up to 7 firmware slots for storing different firmware revisions. This new firmware command supports firmware replacement (i.e. firmware download) with or without immediate activation, or activation of a previously stored firmware image. It also supports selection of the firmware slot during replacement operations, using IDENTIFY information from the controller to check that the specified slot is valid. Newly activated firmware does not take effect until the next controller reset, either via a reboot or separate 'nvmecontrol reset' command to the same controller. Submitted by: Joe Golio <joseph.golio@emc.com> Obtained from: EMC / Isilon Storage Division MFC after: 3 days
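The slot validation mentioned here could look roughly like the sketch below. The bit layout of the FRMW byte follows the NVMe specification's Identify Controller data; the macro and function names are hypothetical, and the exact checks nvmecontrol(8) performs are an assumption, not taken from this log.

```c
#include <stdbool.h>
#include <stdint.h>

/* FRMW byte of the Identify Controller data, per the NVMe specification. */
#define EX_FRMW_SLOT1_RO(frmw)	(((frmw) & 0x01) != 0)	/* bit 0: slot 1 is read-only */
#define EX_FRMW_NUM_SLOTS(frmw)	(((frmw) >> 1) & 0x07)	/* bits 3:1: number of slots */

/*
 * Hypothetical check: a slot must exist on the controller, and a
 * replacement (download) must not target a read-only slot 1.
 */
static bool
example_slot_is_valid(uint8_t frmw, unsigned int slot, bool replacing)
{
	if (slot < 1 || slot > EX_FRMW_NUM_SLOTS(frmw))
		return (false);
	if (replacing && slot == 1 && EX_FRMW_SLOT1_RO(frmw))
		return (false);
	return (true);
}
```

For a controller reporting three slots with slot 1 read-only, replacing slot 2 passes the check, replacing slot 1 does not, and activating the image already in slot 1 is still allowed.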
|
#
8d09e3c4 |
|
26-Jun-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Use MAXPHYS to specify the maximum I/O size for nvme(4). Also allow admin commands to transfer up to this maximum I/O size, rather than the artificial limit previously imposed. The larger I/O size is very beneficial for upcoming firmware download support. This has the added benefit of simplifying the code since both admin and I/O commands now use the same maximum I/O size. Sponsored by: Intel MFC after: 3 days
|
#
5076698e |
|
12-Apr-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Remove the NVME_IDENTIFY_CONTROLLER and NVME_IDENTIFY_NAMESPACE IOCTLs and replace them with the NVMe passthrough equivalent. Sponsored by: Intel
|
#
7c3f19d7 |
|
12-Apr-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Add support for passthrough NVMe commands. This includes a new IOCTL to support a generic method for nvmecontrol(8) to pass IDENTIFY, GET_LOG_PAGE, GET_FEATURES and other commands to the controller, rather than separate IOCTLs for each. Sponsored by: Intel
|
#
5fdf9c3c |
|
01-Apr-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Add unmapped bio support to nvme(4) and nvd(4). Sponsored by: Intel
|
#
232e2edb |
|
26-Mar-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Add the ability to internally mark a controller as failed, if it is unable to start or reset. Also add a notifier for NVMe consumers for controller fail conditions and plumb this notifier for nvd(4) to destroy the associated GEOM disks when a failure occurs. This requires a bit of work to cover the races when a consumer is sending I/O requests to a controller that is transitioning to the failed state. To help cover this condition, add a task to defer completion of I/Os submitted to a failed controller, so that the consumer will still always receive its completions in a different context than the submission. Sponsored by: Intel Reviewed by: carl
|
#
0d7e13ec |
|
26-Mar-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Pass associated log page data to async event consumers, if requested. Sponsored by: Intel Reviewed by: carl
|
#
0692579b |
|
26-Mar-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Add structure definitions and controller command function for firmware log pages. Sponsored by: Intel Reviewed by: carl
|
#
08927782 |
|
26-Mar-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Add structure definitions and a controller command function for error log pages. Sponsored by: Intel Reviewed by: carl
|
#
cf81529c |
|
26-Mar-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Create struct nvme_status. NVMe error log entries include status, so breaking this out into its own data structure allows it to be included in both the nvme_completion data structure as well as error log entry data structures. While here, expose nvme_completion_is_error(), and change all of the places that were explicitly looking at sc/sct bits to use this macro instead. Sponsored by: Intel Reviewed by: carl
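As a rough illustration of what checking the sc/sct bits means, the sketch below decodes the 16-bit status halfword of a completion entry using the bit positions from the NVMe specification. The `EX_` macros and `example_status_is_error` are hypothetical names; the real struct nvme_status and nvme_completion_is_error() may be laid out and written differently.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Upper 16 bits of completion dword 3, per the NVMe specification:
 *   bit 0      phase tag (P)
 *   bits 8:1   status code (SC)
 *   bits 11:9  status code type (SCT)
 *   bit 14     more (M)
 *   bit 15     do not retry (DNR)
 */
#define EX_STATUS_SC(s)		(((s) >> 1) & 0xff)
#define EX_STATUS_SCT(s)	(((s) >> 9) & 0x07)

/* A completion is successful only when both SC and SCT are zero. */
static bool
example_status_is_error(uint16_t status)
{
	return (EX_STATUS_SC(status) != 0 || EX_STATUS_SCT(status) != 0);
}
```

Keeping the decode in one place means the same helper can be applied to a completion entry and to the status stored in an error log entry, which is the reuse this commit is after.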
|
#
dbba7442 |
|
26-Mar-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Add API for nvme consumers to access controller and namespace identify data. Sponsored by: Intel Reviewed by: carl
|
#
b846efd7 |
|
26-Mar-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Add controller reset capability to nvme(4) and ability to explicitly invoke it from nvmecontrol(8). Controller reset will be performed in cases where I/Os are repeatedly timing out, the controller reports an unrecoverable condition, or when explicitly requested via IOCTL or an nvme consumer. Since the controller may be in such a state where it cannot even process queue deletion requests, we will perform a controller reset without trying to clean up anything on the controller first. Sponsored by: Intel Reviewed by: carl
|
#
5f1e251d |
|
26-Mar-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Create a generic nvme_ctrlr_cmd_get_log_page function, and change the health information log page function to use it. Sponsored by: Intel
|
#
99d99f74 |
|
26-Mar-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Expose the get/set features API to nvme consumers. Sponsored by: Intel
|
#
038a5ee4 |
|
26-Mar-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Add an interface for nvme shim drivers (i.e. nvd) to register for notifications when new nvme controllers are added to the system. Sponsored by: Intel
|
#
0a0b08cc |
|
26-Mar-2013 |
Jim Harris <jimharris@FreeBSD.org> |
Enable asynchronous event requests on non-Chatham devices. Also add logic to clean up all outstanding asynchronous event requests when resetting or shutting down the controller, since these requests will not be explicitly completed by the controller itself. Sponsored by: Intel
|
#
0f71ecf7 |
|
17-Oct-2012 |
Jim Harris <jimharris@FreeBSD.org> |
Add ability to queue nvme_request objects if no nvme_trackers are available. This eliminates the need to manage queue depth at the nvd(4) level for Chatham prototype board workarounds, and also adds the ability to accept a number of requests on a single qpair that is much larger than the number of trackers allocated. Sponsored by: Intel
|
#
9eb93f29 |
|
17-Oct-2012 |
Jim Harris <jimharris@FreeBSD.org> |
Add return codes to all functions used for submitting commands to I/O queues. Sponsored by: Intel
|
#
be4dcf1b |
|
18-Sep-2012 |
Jim Harris <jimharris@FreeBSD.org> |
Add __aligned(4) to NVMe defined data structures. This fixes an issue in nvmecontrol(8), where clang throws a cast-align warning when casting a __packed structure pointer to a uint32_t pointer as part of printing raw hex output. Reported by: dhw
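The effect of combining the two attributes can be seen in a small sketch. The structures below are hypothetical, and the attributes are spelled out in GCC/clang syntax rather than with FreeBSD's `__packed` and `__aligned()` macros so the example compiles outside the FreeBSD tree.

```c
#include <stdint.h>

/* Packed only: the compiler assumes the struct may start at any byte. */
struct example_packed {
	uint8_t  opc;
	uint8_t  flags;
	uint16_t cid;
	uint32_t nsid;
} __attribute__((packed));

/* Packed and aligned(4): same layout, but guaranteed 4-byte alignment. */
struct example_packed_aligned {
	uint8_t  opc;
	uint8_t  flags;
	uint16_t cid;
	uint32_t nsid;
} __attribute__((packed, aligned(4)));

_Static_assert(_Alignof(struct example_packed) == 1,
    "packed alone drops the alignment to 1");
_Static_assert(_Alignof(struct example_packed_aligned) == 4,
    "aligned(4) restores 4-byte alignment");
_Static_assert(sizeof(struct example_packed_aligned) == 8,
    "layout is unchanged by the alignment attribute");
```

Because the second structure is known to be 4-byte aligned, casting a pointer to it to `uint32_t *` no longer increases the required alignment, which is what the cast-align warning complains about.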
|
#
bb0ec6b3 |
|
17-Sep-2012 |
Jim Harris <jimharris@FreeBSD.org> |
This is the first of several commits which will add NVM Express (NVMe) support to FreeBSD. A full description of the overall functionality being added is below. nvmexpress.org defines NVM Express as "an optimized register interface, command set and feature set for PCI Express (PCIe)-based Solid-State Drives (SSDs)."
This commit adds nvme(4) and nvd(4) driver source code and Makefiles to the tree.
Full NVMe functionality description: Add nvme(4) and nvd(4) drivers and nvmecontrol(8) for NVM Express (NVMe) device support. There will continue to be ongoing work on NVM Express support, but there is more than enough to allow for evaluation of pre-production NVM Express devices as well as soliciting feedback. Questions and feedback are welcome.
nvme(4) implements NVMe hardware abstraction and is a provider of NVMe namespaces. The closest equivalent of an NVMe namespace is a SCSI LUN. nvd(4) is an NVMe consumer, surfacing NVMe namespaces as GEOM disks. nvmecontrol(8) is used for NVMe configuration and management.
The following are currently supported:
nvme(4)
  - full mandatory NVM command set support
  - per-CPU IO queues (enabled by default but configurable)
  - per-queue sysctls for statistics and full command/completion queue dumps for debugging
  - registration API for NVMe namespace consumers
  - I/O error handling (except for timeouts, see below)
  - compilation switches for support back to stable-7
nvd(4)
  - BIO_DELETE and BIO_FLUSH (if supported by controller)
  - proper BIO_ORDERED handling
nvmecontrol(8)
  - devlist: list NVMe controllers and their namespaces
  - identify: display controller or namespace identify data in human-readable or hex format
  - perftest: quick and dirty performance test to measure raw performance of NVMe device without userspace/physio/GEOM overhead
The following are still work in progress and will be completed over the next 3-6 months in rough priority order:
  - complete man pages
  - firmware download and activation
  - asynchronous error requests
  - command timeout error handling
  - controller resets
  - nvmecontrol(8) log page retrieval
This has been primarily tested on amd64, with light testing on i386. I would be happy to provide assistance to anyone interested in porting this to other architectures, but am not currently planning to do this work myself. Big-endian and dmamap sync for command/completion queues are the main areas that would need to be addressed.
The nvme(4) driver currently has references to Chatham, which is an Intel-developed prototype board which is not fully spec compliant. These references will all be removed over time.
Sponsored by: Intel Contributions from: Joe Golio/EMC <joseph dot golio at emc dot com>
|